For each virus-like sequence, the percentage of positive MDCK cells is plotted against the percentage of positive macaque cells. Virus IDs were categorized as ‘shared’, ‘macaque only’, ‘MDCK only’ or ‘undefined’, as described in the Methods. The blue fill along the x = y line depicts the thresholds of the ‘shared’ category. The dashed lines indicate the minimum percentage of positive cells required for a virus ID to be included in further analysis (≥0.05%, which equals ~100 macaque cells). The inset shows the same plot with linear rather than log-scale axes, so that zero counts are included. A red edge marks contaminating virus-like sequences that were also observed in sequencing data from blank sequencing libraries containing only sterile water and reagent mix (Extended Data Fig. 8a).