ABSTRACT

Machine learning (ML) is the blend of computer science and statistics with the goal of learning from the data and predicting on unseen data. When multiple biomarkers are present in a data set, all being less or more important, it is possible to use ML methods to combine the biomarkers to produce a robust and sensitive signature, which predicts the clinical outcome or the treatment effect. ML is one of the most rapidly growing areas in computational sciences. The role of modeling interactions is important in pharmaceutical statistics. Classification and regression trees procedures are described as tools for building interpretable interactions between variables for explaining the outcome in biomarker data sets. Graphical models have been gaining interest in modeling biomedical data. These models could reveal important variables and their interrelationships in multivariate biomarker data sets.