ABSTRACT

Principal component analysis (PCA) originated in work by Karl Pearson around the turn of the twentieth century and was further developed in the 1930s by Harold Hotelling. PCA transforms a set of correlated variables into a new set of uncorrelated variables, the principal components, ordered by the amount of variation each accounts for. It is a mathematical technique that does not require the user to specify an underlying statistical model to explain the 'error' structure. This chapter discusses the various objectives of PCA and assesses the benefits and drawbacks of applying the method in practice. An important benefit of PCA is that it provides a quick way of assessing the effective dimensionality of a set of data. If the first few components account for most of the variation in the original data, it is often a good idea to use the scores on these first few components in subsequent analyses. No distributional assumptions are needed to do this, and one does not need to try to interpret the principal components en route.
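
As a rough illustration of the two ideas above (uncorrelated component scores, and the cumulative proportion of variation as a guide to effective dimensionality), the following is a minimal sketch rather than a method taken from the chapter. It assumes NumPy and computes the components from an eigendecomposition of the sample covariance matrix. The helper name `pca` and the synthetic data set are hypothetical, chosen only for illustration.

```python
import numpy as np

def pca(X):
    """Covariance-based PCA for a data matrix X (rows = observations, columns = variables).

    Returns (scores, components, explained_ratio). Hypothetical helper for illustration.
    """
    Xc = X - X.mean(axis=0)                  # centre each variable
    cov = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # re-order so the largest variance comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                    # component scores for each observation
    explained_ratio = eigvals / eigvals.sum()
    return scores, eigvecs, explained_ratio

# Synthetic data (made up): 5 correlated variables driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 5))
X = latent @ loadings + 0.05 * rng.normal(size=(200, 5))

scores, components, explained_ratio = pca(X)

# Cumulative proportion of variation accounted for by the first k components.
print(np.cumsum(explained_ratio).round(3))

# The covariance matrix of the scores should be diagonal up to rounding error,
# i.e. the new variables are uncorrelated.
score_cov = np.cov(scores, rowvar=False)
off_diag = score_cov - np.diag(np.diag(score_cov))
print(np.abs(off_diag).max())
```

Nothing in this sketch relies on a distributional assumption. Whether to analyse the covariance matrix, as here, or the correlation matrix (that is, standardising the variables first) is a separate choice that the sketch does not address.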