ABSTRACT

When a compositional data set is complex and has many components, a major simplification can be achieved by identifying which logratios are the most important drivers of the data structure. This subject was partially addressed in Chap. 5, which identified the logratios that were the most significant predictors of a continuous or categorical response variable. In the present chapter, however, the whole compositional data set is treated as the response set, and logratios constructed from that same data set are sought that explain as much as possible of the total logratio variance. As a result, the original data set can effectively be replaced by a small set of logratios, which can then be treated as regular interval-scale statistical variables. Since the selected logratios involve only a subset of the compositional parts, this approach can also be regarded as a way of choosing the most important subcomposition underlying the data.
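To make the idea concrete, the following is a minimal sketch of one way such a selection could be carried out, and not necessarily the procedure used in the chapter. It assumes strictly positive parts, equal weights for all parts, a greedy forward search over pairwise logratios, and a least-squares (redundancy-analysis-style) measure of explained variance with the centred logratios (CLRs) as the response set. The function names clr, explained_fraction and select_logratios are hypothetical, and the data are simulated purely for illustration.

```python
import itertools
import numpy as np


def clr(X):
    """Centred logratio transform of a matrix of strictly positive parts."""
    L = np.log(X)
    return L - L.mean(axis=1, keepdims=True)


def explained_fraction(Y, Z):
    """Fraction of the sum of squares of column-centred Y fitted by Z (least squares)."""
    Zc = Z - Z.mean(axis=0)
    coef, *_ = np.linalg.lstsq(Zc, Y, rcond=None)
    fitted = Zc @ coef
    return float((fitted ** 2).sum() / (Y ** 2).sum())


def select_logratios(X, n_select):
    """Greedy forward selection of pairwise logratios (equal part weights assumed)."""
    L = np.log(X)
    Y = clr(X)
    Y = Y - Y.mean(axis=0)  # centre columns so sums of squares act as variances
    candidates = list(itertools.combinations(range(X.shape[1]), 2))
    chosen, fractions = [], []
    for _ in range(n_select):
        best_pair, best_frac = None, -1.0
        for pair in candidates:
            if pair in chosen:
                continue
            trial = chosen + [pair]
            # Predictors: the logratios log(x_a / x_b) selected so far plus the candidate
            Z = np.column_stack([L[:, a] - L[:, b] for a, b in trial])
            frac = explained_fraction(Y, Z)
            if frac > best_frac:
                best_pair, best_frac = pair, frac
        chosen.append(best_pair)
        fractions.append(best_frac)
    return chosen, fractions


# Simulated 4-part compositions, for illustration only (not data from the chapter)
rng = np.random.default_rng(0)
X = rng.lognormal(size=(30, 4))
X = X / X.sum(axis=1, keepdims=True)

pairs, fractions = select_logratios(X, n_select=3)
for (a, b), frac in zip(pairs, fractions):
    print(f"log(x{a}/x{b}): cumulative fraction explained = {frac:.3f}")
```

Under these assumptions the cumulative fraction cannot decrease as logratios are added, and for a composition with J parts it reaches 1 once J - 1 non-redundant logratios have been selected. In practice the search would be stopped much earlier, at the point where a few logratios already account for most of the total logratio variance.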