Line Plot from Scientific Research

Citation
The random forest model was trained using the abundance of fecal metagenomic species, the levels of serum cytokines, and targeted neurotransmitters in our cohort. All variables were first ranked by variable importance and then added sequentially to the model. The error curves were plotted for the five trials of tenfold cross-validation in random forest classification as the number of variables increased. The black curve indicates the average cross-validation error of the five trials (shown in gray). The minimum error in the averaged curve plus the standard deviation at that point was used as the cutoff for feature selection. The model containing the smallest number of variables with an error below that cutoff was chosen as the optimal classifier. The red line marks the number of variables in the optimized model (n = 10).
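The cutoff rule described in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `select_num_variables`, the toy error values, and the choice of the sample standard deviation across trials are all hypothetical, since the caption does not say how the standard deviation was computed.

```python
from statistics import mean, stdev


def select_num_variables(error_curves, var_counts):
    """Pick the smallest model whose averaged CV error is below the cutoff.

    error_curves: one list of cross-validation errors per trial,
                  each aligned with var_counts.
    var_counts:   number of variables in the model at each position.
    Returns (chosen number of variables, cutoff, averaged curve).
    """
    n_points = len(var_counts)
    if any(len(curve) != n_points for curve in error_curves):
        raise ValueError("every trial needs one error per variable count")

    # Regroup errors by position so each column holds all trials
    # for one number of variables.
    columns = list(zip(*error_curves))
    avg = [mean(col) for col in columns]

    # Locate the minimum of the averaged curve.
    i_min = min(range(n_points), key=avg.__getitem__)

    # Cutoff = minimum averaged error + SD across trials at that point.
    # Assumption: sample standard deviation (n - 1 denominator).
    cutoff = avg[i_min] + stdev(columns[i_min])

    # Smallest number of variables whose averaged error is below the cutoff.
    below = [k for k, err in zip(var_counts, avg) if err < cutoff]
    # If the SD is zero nothing is strictly below; fall back to the minimum.
    chosen = min(below) if below else var_counts[i_min]
    return chosen, cutoff, avg


# Toy data for illustration only (three trials, five model sizes);
# these numbers are invented and are not taken from the study.
var_counts = [1, 2, 3, 4, 5]
error_curves = [
    [0.40, 0.30, 0.21, 0.20, 0.21],
    [0.42, 0.28, 0.19, 0.18, 0.22],
    [0.38, 0.32, 0.23, 0.22, 0.20],
]

chosen, cutoff, avg = select_num_variables(error_curves, var_counts)
print(chosen, round(cutoff, 3))
```

In the toy data the averaged curve reaches its minimum at four variables, but the three-variable model already falls below the cutoff, so the rule prefers the smaller model. The same logic would apply to the five trials in the figure, where it selects n = 10.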