To build the AMPSphere, we first assembled 63,410 publicly available metagenomes from diverse habitats. We then predicted genes on the resulting metagenomic contigs, as well as on 87,920 microbial genomes from ProGenomes2, using a modified version of Prodigal that can also predict smORFs (30–300 bp). Applying Macrel to the 4,599,187,424 predicted smORFs yielded 863,498 non-redundant c_AMPs (see also Figure S1). The c_AMPs were then hierarchically clustered in a reduced amino acid alphabet using 100%, 85%, and 75% identity cutoffs. At 75% identity, we observed 118,051 non-singleton clusters, of which 8,788 contained at least 8 c_AMPs and were considered families.
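
The clustering and family-calling steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used to build the AMPSphere: the eight-group reduced alphabet, the function names, and the greedy longest-first strategy are assumptions made for illustration, and the identity measure is a simple `difflib` stand-in for the alignment-based identity that a dedicated clustering tool would compute. Only the three identity cutoffs and the family threshold of 8 members come from the text.

```python
from difflib import SequenceMatcher

# Hypothetical reduced alphabet (assumption): each group of similar
# residues is collapsed onto its first letter.
GROUPS = ["LVIMC", "AG", "ST", "P", "FYW", "EDNQ", "KR", "H"]
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_alphabet(seq):
    """Rewrite a peptide in the reduced alphabet (unknown residues become X)."""
    return "".join(REDUCE.get(aa, "X") for aa in seq)


def identity(a, b):
    """Stand-in identity: matched characters divided by the shorter length."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matches = sum(block.size for block in matcher.get_matching_blocks())
    return matches / min(len(a), len(b))


def greedy_cluster(seqs, cutoff):
    """Assign each sequence to the first representative it matches at `cutoff`."""
    clusters = {}  # representative -> member sequences
    reduced = {}   # representative -> reduced-alphabet form
    for seq in sorted(seqs, key=len, reverse=True):
        red = reduce_alphabet(seq)
        for rep in clusters:
            if identity(reduced[rep], red) >= cutoff:
                clusters[rep].append(seq)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[seq] = [seq]
            reduced[seq] = red
    return clusters


def hierarchical_cluster(seqs, cutoffs=(1.0, 0.85, 0.75)):
    """Cluster at each cutoff in turn, re-clustering the previous representatives."""
    groups = {s: [s] for s in seqs}
    levels = {}
    for cutoff in cutoffs:
        merged = {}
        for rep, reps_in_cluster in greedy_cluster(list(groups), cutoff).items():
            # Expand each merged representative back into its full membership.
            merged[rep] = [m for r in reps_in_cluster for m in groups[r]]
        groups = merged
        levels[cutoff] = list(groups.values())
    return levels


def families(clusters, min_size=8):
    """Keep clusters with at least `min_size` members (8 in the text)."""
    return [c for c in clusters if len(c) >= min_size]
```

Calling `hierarchical_cluster(peptides)` returns the clusters at each cutoff, and `families(levels[0.75])` keeps those with at least 8 members. Quadratic greedy comparison of this kind would not scale to hundreds of thousands of sequences; it is meant only to make the logic of the three-level clustering concrete.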