Box plot of intra- and interclass Tanimoto similarities calculated from Morgan fingerprints (ECFC4), drawn following Tukey’s definitions with outliers (ref. 83). Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers. The dashed line indicates the 95th percentile median (0.23) of random reference compound subsets. For full cross-similarity values, see Supplementary Figs. 9–10.
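As an illustration of the quantities named in this legend, the following minimal Python sketch computes a Tanimoto similarity between two count-based fingerprints and the Tukey box-plot statistics of a list of similarity values. It is not the authors' code. It assumes fingerprints are represented as plain dictionaries mapping a feature identifier to its count, uses the min/max form of the Tanimoto coefficient for counts, and takes quartiles with the "inclusive" method of the Python standard library; the function names `tanimoto_counts` and `tukey_box_stats` are hypothetical, and the source does not state which quartile convention or software was used.

```python
from statistics import quantiles


def tanimoto_counts(fp_a, fp_b):
    """Tanimoto similarity for count fingerprints (assumed dict: feature -> count).

    Uses the min/max generalisation: sum of per-feature minima divided by
    sum of per-feature maxima.
    """
    keys = set(fp_a) | set(fp_b)
    shared = sum(min(fp_a.get(k, 0), fp_b.get(k, 0)) for k in keys)
    total = sum(max(fp_a.get(k, 0), fp_b.get(k, 0)) for k in keys)
    return shared / total if total else 0.0


def tukey_box_stats(values):
    """Median, quartiles, whisker ends and outliers in Tukey's convention.

    Whiskers reach the most extreme data points lying within 1.5x the
    interquartile range of the box; anything beyond is an outlier.
    The quartile method ("inclusive") is an assumption of this sketch.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Toy fingerprints and similarity values, invented for illustration only.
    print(tanimoto_counts({1: 2, 2: 1}, {1: 1, 3: 1}))
    print(tukey_box_stats([0.10, 0.12, 0.15, 0.18, 0.20, 0.90]))
```

In this toy input the value 0.90 lies above the upper fence and would be drawn as an individual point, while the upper whisker stops at 0.20.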