SIMBA graph construction and embedding in scATAC-seq analysis. Biological entities, including cells, peaks or bins, TF motifs and k-mers, are represented as shapes and colored by their relevant cell type (green or orange). Non-informative features are colored dark gray. Cells and chromatin-accessible features (peaks/bins) are organized into a cell × peak/bin matrix. When sequence information (TF motifs or k-mer sequences) within these regions is available, it can be organized into two sub-matrices that associate TF motifs and k-mer sequences with each peak or bin. The constructed feature matrices are then binarized and assembled into a graph. When a single feature type (chromatin accessibility) is used, the graph encodes cells and peaks/bins as nodes. When multiple feature types (both chromatin accessibility and DNA sequences) are used, the graph can be extended by adding TF motifs and k-mer sequences as nodes. Finally, SIMBA embeddings of these entities are generated through a graph embedding procedure.
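The binarize-and-assemble step described in the caption can be illustrated with a minimal sketch. It is not SIMBA's own API: the function names (`binarize`, `edges_from_matrix`, `build_graph`), the node-naming scheme and the toy matrices are all hypothetical. It also assumes that every nonzero entry of a binarized matrix becomes one edge, and that any value greater than zero counts as present.

```python
def binarize(matrix):
    """Return a 0/1 copy of the matrix: 1 wherever the entry is positive."""
    return [[1 if value > 0 else 0 for value in row] for row in matrix]


def edges_from_matrix(matrix, row_prefix, col_prefix):
    """Emit one (row_node, col_node) edge per nonzero entry after binarizing."""
    return [
        (f"{row_prefix}{i}", f"{col_prefix}{j}")
        for i, row in enumerate(binarize(matrix))
        for j, value in enumerate(row)
        if value
    ]


def build_graph(cell_by_peak, peak_by_motif=None, peak_by_kmer=None):
    """Assemble nodes and edges from the feature matrices (hypothetical helper).

    With only cell_by_peak, the nodes are cells and peaks. Passing the
    optional sub-matrices extends the graph with motif and k-mer nodes.
    Only entities that take part in at least one edge appear as nodes.
    """
    edges = edges_from_matrix(cell_by_peak, "cell", "peak")
    if peak_by_motif is not None:
        edges += edges_from_matrix(peak_by_motif, "peak", "motif")
    if peak_by_kmer is not None:
        edges += edges_from_matrix(peak_by_kmer, "peak", "kmer")
    nodes = sorted({node for edge in edges for node in edge})
    return nodes, edges


# Toy data (made up): 2 cells x 3 peaks, and 3 peaks x 2 motifs.
cell_by_peak = [[2, 0, 1],
                [0, 3, 0]]
peak_by_motif = [[1, 0],
                 [0, 1],
                 [1, 0]]

nodes, edges = build_graph(cell_by_peak, peak_by_motif)
print(nodes)
print(edges)
```

The resulting edge list is the kind of input a graph embedding procedure would consume. The embedding step itself is outside the scope of this sketch.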