The fraction of valid SMILES indicates the fraction of generated SMILES that can successfully be parsed using RDKit (note that it plateaus at approximately 0.1 rather than at 0) 63. We then determine the fraction of those that are already part of the training set and find that, at low temperature, GPT-3 tends to restate molecules from the training set. To quantitatively capture the similarity of the distribution of the generated molecules to that of the training set, we compute the Fréchet ChemNet distance 64, which quantifies both diversity and distribution match 51 and goes through a minimum at intermediate temperatures. To quantify how well the generated molecules match the desired transition wavelengths, we use the GPR models reported in ref. 43 to predict the transition wavelengths. The dashed horizontal lines indicate those models’ mean absolute error (MAE). Across all temperatures, we found high average synthesizability (synthetic accessibility (SA) score 44 smaller than 3). Error bands indicate s.e.m.
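
The validity and training-set-overlap fractions described above can be sketched as follows. This is a minimal illustration, not the code used for the figure: the helper names (`rdkit_canonicalize`, `validity_and_overlap`, `sem`) are hypothetical, and we assume here that overlap is measured on canonical SMILES, that it is computed only over the valid molecules, and that the s.e.m. is the sample standard deviation divided by the square root of the number of runs.

```python
from math import sqrt
from statistics import stdev
from typing import Callable, Optional, Sequence, Tuple


def rdkit_canonicalize(smiles: str) -> Optional[str]:
    """Return the RDKit canonical SMILES, or None if the string cannot be parsed."""
    # Imported lazily so the helpers below can also be used with another parser.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)  # returns None for unparsable input
    return None if mol is None else Chem.MolToSmiles(mol)


def validity_and_overlap(
    generated: Sequence[str],
    training: Sequence[str],
    canonicalize: Callable[[str], Optional[str]] = rdkit_canonicalize,
) -> Tuple[float, float]:
    """Return (fraction of valid SMILES, fraction of valid ones found in training).

    Assumption: overlap is computed over valid molecules only, without
    removing duplicates among the generated molecules.
    """
    canonical = [canonicalize(s) for s in generated]
    valid = [c for c in canonical if c is not None]
    frac_valid = len(valid) / len(canonical) if canonical else 0.0

    # Canonicalize the training set once so that lookups are set membership tests.
    train = {c for c in (canonicalize(s) for s in training) if c is not None}
    frac_in_train = sum(c in train for c in valid) / len(valid) if valid else 0.0
    return frac_valid, frac_in_train


def sem(values: Sequence[float]) -> float:
    """Standard error of the mean; requires at least two values."""
    return stdev(values) / sqrt(len(values))
```

With RDKit installed, `validity_and_overlap(generated_smiles, training_smiles)` gives the two fractions for one sampling temperature, and `sem` can be applied to the values obtained from repeated runs at that temperature.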