A hybrid quantum-classical algorithm. On the quantum device, we first prepare the initial state and apply the already inferred protocol actions as gates. In the example above, the initial state is the fully $z$-polarized state, $|0\rangle^{\otimes 4}$, and two actions are performed: a global rotation around $\hat{X}$ followed by a global two-qubit $\hat{Y}\hat{Y}$ rotation. The resulting state $|\psi\rangle$ is the input to the QMPS network. The QMPS tensors $\theta_Q = A^{(1)} \cdots A^{(N)}$ can be mapped to unitary gates on a quantum circuit. To compute the Q-values, we first apply the inverse of the QMPS circuit unitary, $U_\theta^\dagger$, and measure the output in the computational basis. The fraction of all-zero measurement outcomes approximates the fidelity $|\langle \theta_Q | \psi \rangle|^2$. Note that this is the fidelity with respect to the Q-value network state $|\theta_Q\rangle$, not the target quantum state, which is not required during protocol inference. The fidelity estimates are then fed into the NN on a classical computer. From the resulting Q-values we infer the next action, and we repeat these steps until the target state is reached.
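The measurement step can be illustrated with a small classical simulation. The sketch below is not the implementation used here: it assumes that the network state is $|\theta_Q\rangle = U_\theta |0\rangle^{\otimes 4}$, replaces the QMPS circuit with an arbitrary random unitary, and builds an input state with a chosen overlap of 0.7. The function names `random_unitary` and `estimate_fidelity`, the shot count, and the seed are all hypothetical choices made for illustration.

```python
import numpy as np


def random_unitary(dim, rng):
    """Return a random unitary matrix (stand-in for the QMPS circuit U_theta)."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)  # the Q factor of a complex matrix is unitary
    return q


def estimate_fidelity(u_theta, psi, shots, rng):
    """Apply U_theta^dagger to psi, sample computational-basis outcomes,
    and return the fraction of all-zero bitstrings (basis index 0)."""
    out = u_theta.conj().T @ psi
    probs = np.abs(out) ** 2
    probs = probs / probs.sum()  # guard against floating-point drift
    samples = rng.choice(len(probs), size=shots, p=probs)
    return np.mean(samples == 0)


rng = np.random.default_rng(0)
n_qubits = 4
dim = 2 ** n_qubits

u_theta = random_unitary(dim, rng)

# Assumption: |theta_Q> = U_theta |0...0>, i.e. the first column of U_theta.
theta_q = u_theta[:, 0]

# Hypothetical input state with overlap 0.7 on |theta_Q>; the remainder sits
# on an orthogonal direction (the second column of U_theta).
psi = np.sqrt(0.7) * theta_q + np.sqrt(0.3) * u_theta[:, 1]

exact = np.abs(np.vdot(theta_q, psi)) ** 2
estimate = estimate_fidelity(u_theta, psi, shots=20_000, rng=rng)

print(f"exact fidelity:     {exact:.4f}")
print(f"sampled estimate:   {estimate:.4f}")
```

The statistical error of the estimate shrinks as the inverse square root of the number of shots, so the shot count sets how precise the fidelity features passed to the NN can be.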