A synthetic dataset consisting of n = 50,000 samples and p = 1,000 normally distributed features was generated. Some features are correlated with the outcome (informative features, light blue), whereas the others are not (uninformative features, gray). Forty thousand samples are held out for validation. From the remaining 10,000 samples, 50 random sets with sample sizes n ranging from 50 to 1,000 are drawn to assess model performance. The Stabl SRM framework is applied with Lasso (Stabl L) and MX knockoffs for noise generation. Performance is evaluated on continuous outcomes (regression tasks).
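The data-generation and sampling scheme described in this caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names (`make_synthetic`, `split_and_subsample`), the linear outcome model, the coefficient range, the noise level, the number of informative features, and the independence of the features are all hypothetical choices. The sketch also assumes that 50 random subsets are drawn at each sample size, which is one possible reading of the caption. It does not implement Stabl or MX knockoffs, and the demo uses scaled-down dimensions so it runs quickly.

```python
import numpy as np


def make_synthetic(n_samples, n_features, n_informative, noise_sd=1.0, seed=0):
    """Gaussian features with a linear outcome driven by a few informative ones.

    Assumption: features are independent standard normals and the outcome is
    linear in the informative features; the caption does not specify either.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))
    # Pick which columns carry signal (hypothetical count and effect sizes).
    informative = np.sort(rng.choice(n_features, size=n_informative, replace=False))
    beta = np.zeros(n_features)
    beta[informative] = rng.uniform(0.5, 1.5, size=n_informative)
    y = X @ beta + noise_sd * rng.standard_normal(n_samples)
    return X, y, informative


def split_and_subsample(n_samples, n_validation, subset_sizes, n_repeats, seed=0):
    """Hold out a validation set, then draw repeated training subsets from the rest."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    val_idx, pool_idx = perm[:n_validation], perm[n_validation:]
    # Assumption: n_repeats independent draws (without replacement) per size.
    subsets = {
        n: [rng.choice(pool_idx, size=n, replace=False) for _ in range(n_repeats)]
        for n in subset_sizes
    }
    return val_idx, pool_idx, subsets


# Scaled-down demo; the caption's setting would be n_samples=50_000,
# n_features=1_000, n_validation=40_000.
X, y, informative = make_synthetic(n_samples=5_000, n_features=100, n_informative=10)
val_idx, pool_idx, subsets = split_and_subsample(
    n_samples=5_000, n_validation=4_000, subset_sizes=[50, 100, 500, 1_000], n_repeats=50
)
```

Each entry of `subsets[n]` is an index array into `X` and `y`; a feature-selection model would be fit on `X[idx], y[idx]` and scored on `X[val_idx], y[val_idx]`, with the selected features compared against `informative`.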