Percentage of reads per sample assigned to sequence-variant correct calls and miscalls in a set of 235 synthetic mock infections of varying complexity (1 to 10 clones per infection, mixed in different molar ratios to emulate infection scenarios found in clinical isolates). Read coverage was measured in the first and second iterations of the analysis workflow, showing that the noise reduction module introduced between the two iterations significantly reduces the percentage of reads per sample assigned to the three most frequent size-variant miscalls (Mann-Whitney pairwise comparison, ****p < 0.0001). Mean and SD are shown. Based on these distributions, thresholds for variant calling were set to the mean coverage of the most common miscall plus two standard deviations: 14% for size variants and 16% for sequence variants (shown as dotted lines in C and D). See also Table S1.
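The threshold rule stated in the legend (mean coverage of the most common miscall plus two standard deviations) can be illustrated with a minimal sketch. The function names, the use of the sample standard deviation, and the strict "greater than" comparison are assumptions made for illustration, and the input values are invented; they are not data from the study.

```python
import statistics


def miscall_threshold(miscall_coverages, n_sd=2.0):
    """Return mean + n_sd * SD of per-sample miscall coverage (in % of reads).

    Assumption: the sample standard deviation is used; the legend does not
    say whether the sample or the population SD was applied.
    """
    mean = statistics.mean(miscall_coverages)
    sd = statistics.stdev(miscall_coverages)
    return mean + n_sd * sd


def passes_threshold(variant_coverage, threshold):
    """Assumption: a variant is retained only if its coverage exceeds the threshold."""
    return variant_coverage > threshold


# Hypothetical per-sample coverage (%) of the most common miscall.
example_coverages = [10, 12, 14]
threshold = miscall_threshold(example_coverages)
```

With these invented values the mean is 12 and the sample SD is 2, so the threshold evaluates to 16% of reads; a variant at 20% coverage would be retained and one at 15% would be discarded.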