Methodological procedure of the work presented here. The full dataset was divided into three smaller datasets. Survival analysis was performed on each dataset to evaluate how disease stage (II vs. III), sidedness of the primary tumor site in the colon (Right vs. Left), and class (P, primary patients that do not metastasize, vs. PM, primary patients that metastasize) are related to the risk of death. Afterwards, three different approaches were used to classify early-stage patients that metastasize: (1) classifiers without regularization (DT, decision trees; svmL, linear support vector machine; svmR, radial support vector machine; LR, logistic regression; RF, random forest) applied to the subset of genes found to be differentially expressed between the two groups (P vs. PM); (2) regularized logistic regression performed on the full dataset using two different penalties (EN, elastic net, and iTwiner); (3) classifiers applied to genes pre-selected by regularized logistic regression. Model performance was compared using different types of measures (e.g., accuracy and number of misclassifications).
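
Approach (3) can be illustrated with a minimal sketch: an elastic-net-penalized logistic regression selects features, and an unpenalized logistic regression is then fitted on the selected features only. This is not the pipeline used in the study. The function names (`fit_logistic`, `soft_threshold`, `predict`), the hyperparameter values, and the eight-sample toy matrix are all hypothetical and made up for illustration, the optimizer is a plain proximal gradient descent chosen for brevity, and iTwiner is not implemented.

```python
import math

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_logistic(X, y, lam=0.0, alpha=0.5, lr=0.1, n_iter=500):
    """Logistic regression fitted by proximal gradient descent (illustrative).

    Penalty (elastic-net form): lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2).
    With lam=0 this reduces to plain, unpenalized logistic regression.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        for j in range(p):
            # Gradient step on the smooth part (loss + L2), then L1 proximal step.
            step = w[j] - lr * (grad_w[j] + lam * (1 - alpha) * w[j])
            w[j] = soft_threshold(step, lr * lam * alpha)
    return w, b

def predict(X, w, b):
    """Assign class 1 when the linear score is positive, else class 0."""
    return [1 if b + sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0 for xi in X]

# Made-up toy data: column 0 tracks the class, column 1 is uninformative.
X = [[1, 1], [1, -1], [2, 1], [2, -1], [-1, 1], [-1, -1], [-2, 1], [-2, -1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = PM (metastasizes), 0 = P (does not)

# Step 1: elastic-net logistic regression; keep features with nonzero weight.
w_en, b_en = fit_logistic(X, y, lam=0.1, alpha=0.5)
selected = [j for j, wj in enumerate(w_en) if wj != 0.0]

# Step 2: unpenalized logistic regression on the pre-selected features only.
X_sel = [[xi[j] for j in selected] for xi in X]
w_lr, b_lr = fit_logistic(X_sel, y)
y_hat = predict(X_sel, w_lr, b_lr)

# Performance measures named in the caption.
misclassified = sum(1 for yi, pi in zip(y, y_hat) if yi != pi)
accuracy = 1 - misclassified / len(y)
print(selected, accuracy, misclassified)
```

The sketch scores the model on its own training data only to stay short. A real comparison of classifiers would use held-out samples or cross-validation.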