Each point on the UMAP plot represents the centroid of the DreaMS embeddings (that is, the mean embedding, computed separately for each dimension) of all MS/MS spectra acquired from one food sample measured on a QTOF instrument (ref. 75). Numbered points indicate selected example samples and refer to the textual descriptions assigned by the data collectors. The figure demonstrates that the space of sample-level embeddings captures the taxonomy of food items presented to DreaMS as collections of MS/MS spectra. Specifically, the space is organized into three major regions predominantly populated with beverages (purple ellipse), plant food items (green ellipse) and animal food items (pink ellipse). Beverages are separated into milk beverages (orange) and other beverages (purple). Animal-based food items are divided into clusters comprising various dairy products (orange) and types of meat (pink). Plant-based food items show less distinction between categories and are primarily classified as vegetables (green), fruits (blue) and herbs and spices (gray). Individual categories (colors) were assigned to sample descriptions using ChatGPT 4 (ref. 76). The details are provided in Methods.
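The sample-level centroid described in the caption can be illustrated with a minimal sketch. It assumes the per-spectrum embeddings are already available as a NumPy array with one row per MS/MS spectrum, together with a sample label for each row; the function name `sample_centroids` and the toy data are hypothetical and are not taken from the DreaMS code base. The subsequent two-dimensional UMAP projection of the centroids is not shown.

```python
import numpy as np


def sample_centroids(embeddings, sample_ids):
    """Average spectrum embeddings per sample.

    embeddings: array of shape (n_spectra, n_dims), one row per MS/MS spectrum.
    sample_ids: sequence of length n_spectra naming the sample of each spectrum.
    Returns (unique_ids, centroids), with centroids of shape (n_samples, n_dims).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    sample_ids = np.asarray(sample_ids)

    # Map every spectrum to the index of its sample (unique_ids is sorted).
    unique_ids, inverse = np.unique(sample_ids, return_inverse=True)
    inverse = inverse.ravel()

    # Sum the embeddings of each sample, then divide by the spectrum count.
    sums = np.zeros((len(unique_ids), embeddings.shape[1]))
    np.add.at(sums, inverse, embeddings)
    counts = np.bincount(inverse, minlength=len(unique_ids))
    return unique_ids, sums / counts[:, None]


# Toy data (illustrative only): three spectra with 3-dimensional embeddings.
toy_embeddings = np.array([
    [1.0, 0.0, 2.0],  # spectrum from sample "milk"
    [3.0, 0.0, 4.0],  # spectrum from sample "milk"
    [0.0, 6.0, 0.0],  # spectrum from sample "apple"
])
toy_sample_ids = ["milk", "milk", "apple"]

ids, centroids = sample_centroids(toy_embeddings, toy_sample_ids)
```

Each row of `centroids` corresponds to one point on the plot before dimensionality reduction; a UMAP implementation would then be fitted on these rows rather than on the individual spectra.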