Example outcome of latent embedding and differential analysis with different reference designs. Left, uniform manifold approximation and projection (UMAP) embedding of the scVI latent space learned on the embedding reference dataset. Points are colored according to cell type clusters (as in a); the icons in the top left corner indicate the type of embedding reference dataset used. Center, UMAP embedding of cells from the differential analysis reference and disease datasets in the scVI latent space learned from the embedding reference dataset, colored by dataset type, with the OOR cell state highlighted in pink. For the CR design, we distinguished between latent embedding with query mapping (CR scArches) and embedding in one step, in which an scVI model is trained on the concatenated control and disease datasets (CR scVI). Right, Milo neighborhood graph visualization of DA testing results: each point represents a neighborhood, and points are colored according to the log fold change (logFC) in cell abundance between disease and reference cells. Only neighborhoods with significant enrichment in disease cells (10% spatial FDR and logFC > 0) are colored. Points are positioned at the UMAP coordinates of the neighborhood index cell; point size is proportional to the number of cells in the neighborhood. Horizontal dashed lines separate the phases of the workflow.
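The coloring rule for the right-hand panels can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `enriched_neighborhoods`, the toy values, and the field names `logFC` and `SpatialFDR` are assumptions chosen for illustration, as is the use of a strict inequality for the 10% spatial FDR cutoff.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[int, float]]


def enriched_neighborhoods(results: List[Row], fdr_threshold: float = 0.10) -> List[Row]:
    """Keep neighborhoods significantly enriched in disease cells.

    A neighborhood is kept when its spatial FDR is below the threshold
    (assumed strict) and its log fold change is positive.
    """
    return [
        row
        for row in results
        if row["SpatialFDR"] < fdr_threshold and row["logFC"] > 0
    ]


# Hypothetical per-neighborhood DA results (made-up numbers, illustration only).
toy_results: List[Row] = [
    {"nhood": 0, "logFC": 2.1, "SpatialFDR": 0.01},   # enriched and significant
    {"nhood": 1, "logFC": -1.5, "SpatialFDR": 0.02},  # significant but depleted
    {"nhood": 2, "logFC": 0.8, "SpatialFDR": 0.40},   # enriched, not significant
    {"nhood": 3, "logFC": 1.2, "SpatialFDR": 0.09},   # enriched and significant
]

colored = enriched_neighborhoods(toy_results)
print([row["nhood"] for row in colored])
```

Under this rule, neighborhoods that fail either condition would be left uncolored in the graph, which is why depleted or non-significant neighborhoods do not appear highlighted.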