Final single-particle fidelity $F_{\mathrm{sp}} = \sqrt[N]{F}$ (a) and corresponding protocol length (b) versus the initial Ising ground state parameter values $g_x$ and $g_z$. The target is a state in the critical region of the Ising model at ($J = +1$, $g_x = 0.5$, $g_z = 1.5$). Training started only from initial states sampled randomly from the enclosed white rectangle. Each part of the colour bars is shown on a linear scale, with the fidelity threshold ($F_{\mathrm{sp}}^* = 0.97$, $F^* \approx 0.61$) and the maximum episode length during training (50) indicated by black lines.
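
The two threshold values quoted in the caption are related through the N-th root in the definition of the single-particle fidelity. The following minimal sketch illustrates that conversion. The system size `N = 16` is an assumption: it is not stated in this caption and is inferred only because it is the integer for which 0.97 raised to the N-th power rounds to 0.61. The function names are hypothetical and chosen for illustration.

```python
def many_body_fidelity(f_sp: float, n: int) -> float:
    """Many-body fidelity F corresponding to a single-particle fidelity F_sp = F**(1/n)."""
    return f_sp ** n


def single_particle_fidelity(f: float, n: int) -> float:
    """Single-particle fidelity F_sp, defined as the n-th root of the many-body fidelity F."""
    return f ** (1.0 / n)


# Assumption: N = 16 is inferred from the quoted thresholds, not taken from the caption.
N = 16
F_SP_THRESHOLD = 0.97  # single-particle threshold quoted in the caption

f_threshold = many_body_fidelity(F_SP_THRESHOLD, N)
print(f"F* = {f_threshold:.2f}")
print(f"F_sp* = {single_particle_fidelity(f_threshold, N):.2f}")
```

Under this assumption the many-body threshold evaluates to about 0.61, matching the value given in the caption. Neighbouring sizes do not: N = 15 gives about 0.63 and N = 17 about 0.60.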