Runtime (top), memory (middle) and L2 error (bottom) of each method (colors) in the ‘mixture’ (left; with bead size = 1.0), ‘dropout’ (middle; with drop midpoint = -1.0) and ‘differentiation’ (right) use cases, at different numbers of observations (cells or beads, x axis), ranging from 2^10 ≈ 1k to 2^20 ≈ 1M, which are up- or downsampled from the datasets in Fig. 2. The reference size is fixed at 2^14 ≈ 16k. The maximum compute resources per run are 8 CPU cores for 8 h with 8 GB of memory each. Missing data points indicate that either compute time or memory was insufficient to complete the annotation. Methods marked with an asterisk do not natively return (fractional) annotations of spatial measurements, which leaves the total annotation fractions in the spatial measurement as degrees of freedom; the wrapper fills these in using the reference type fractions. Categorical annotation methods are shown dashed.