Sequencing data were collated from the American Gut Project and from prior analyses of the AGEhIV cohort and were processed identically [46, 48, 52] using dada2 [56]. For both datasets, Canberra beta-diversity matrices were calculated, and PERMANOVA tests were performed to quantify the significance and effect size of ecological distances between cases and controls for each disease. Sample sizes, shown in parentheses, refer to balanced cohorts of cases and controls matched for the confounding variables displayed at top left. For HIV cohorts, PERMANOVA statistics were calculated on five sample groups from two studies [46, 52]: men who have sex with men (n = 76) [46], females (n = 38) [46], men who have sex with women (n = 34) [46], combined females and males irrespective of sexual behavior (n = 148) [46], and a separate cohort of men who have sex with men (n = 102) [52].
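
For readers unfamiliar with the two statistics named above, the following is a minimal pure-Python sketch of a Canberra distance matrix followed by a one-way PERMANOVA (pseudo-F, R² as the effect size, and a permutation p-value). It is an illustration only and not the pipeline used in the cited studies: the function names, the toy abundance table, the number of permutations, and the random seed are all assumptions introduced here.

```python
import random

def canberra(u, v):
    """Canberra distance; features that are zero in both samples contribute 0."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total

def distance_matrix(samples):
    """Symmetric pairwise Canberra distance matrix."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = canberra(samples[i], samples[j])
    return dist

def pseudo_f(dist, labels):
    """One-way PERMANOVA pseudo-F and R^2 from squared pairwise distances."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for k, i in enumerate(idx) for j in idx[k + 1:]) / len(idx)
    ss_between = ss_total - ss_within
    f_stat = (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))
    return f_stat, ss_between / ss_total

def permanova(dist, labels, permutations=999, seed=0):
    """Permutation test: shuffle group labels and recompute pseudo-F."""
    rng = random.Random(seed)
    f_obs, r2 = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled)[0] >= f_obs:
            hits += 1
    p_value = (hits + 1) / (permutations + 1)
    return f_obs, r2, p_value

# Hypothetical toy abundance table: 3 cases and 3 controls, 4 features each.
cases = [[10, 0, 5, 1], [12, 1, 4, 0], [9, 0, 6, 2]]
controls = [[0, 8, 1, 7], [1, 9, 0, 6], [0, 7, 2, 8]]
labels = ["case"] * 3 + ["control"] * 3

dist = distance_matrix(cases + controls)
f_obs, r2, p_value = permanova(dist, labels)
print(f"pseudo-F = {f_obs:.2f}, R^2 = {r2:.2f}, p = {p_value:.3f}")
```

In this sketch R² is the fraction of the total sum of squared distances attributable to the case/control grouping, which is one common way to report a PERMANOVA effect size. With very small groups such as the toy example, the number of distinct label permutations is limited, so the smallest attainable p-value is correspondingly coarse.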