ABSTRACT

Choosing the right method to answer a biological question is a key skill in statistical analysis. The mixOmics package offers a wide range of multivariate methods that use dimension reduction and feature selection to address different types of biological questions and to guide further analyses. This chapter introduces methods for interpreting a single data set (e.g. PCA, PLS-DA), integrating more than one data set (e.g. PLS, rCCA, multi-block PLS-DA), and integrating data sets from different studies (multi-group PLS-DA). Both unsupervised (exploratory) and supervised methods are available, depending on whether a sample outcome known a priori is ignored or included in the model. Sparse variants of all techniques are proposed to identify the most relevant variables that explain an outcome of interest, or that capture information common to several data sets. This chapter describes the generic and specific types of data that can be analysed with the package, and provides examples of biological questions that each method can answer. Finally, we outline the exemplar data sets that are available in the package and used in the case studies in the third part of this book.