ABSTRACT

The concept of principal component analysis (PCA) is mostly credited to Karl Pearson (1901) and, later, to Harold Hotelling (1933), who developed a technique to reduce the complexity of a given set of variables. To achieve this goal, PCA constructs uncorrelated so-called principal components (PCs) as linear combinations of the original variables that successively have maximum variance. A subset of these uncorrelated principal components is then chosen such that the number of components is smaller than the number of original variables while the data are still approximated adequately, that is, as much of the overall variance as possible is preserved. This approach makes it possible to reduce the complexity of large data sets while minimizing the loss of information. The procedure can be seen as a transformation of the data into a space with fewer dimensions. Sometimes the principal components themselves are the focus of research, and sometimes they serve as input for further data analyses (e.g., regression analysis or cluster analysis). PCA shares its main goal of reducing complexity with the method of factor analysis (FA), and the two methods rely on similar mathematical calculations; however, PCA and FA differ in several respects, the importance of which is a matter of controversy within the scientific community. The most important difference might be that PCA is a more adaptive, exploratory, and descriptive method and, unlike FA, does not necessarily require an explicit model.
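As a minimal illustration of the procedure described above, the following Python sketch computes principal components via the eigendecomposition of the sample covariance matrix and projects the data onto the leading components. The data matrix, the number of retained components, and all variable names are illustrative assumptions, not part of the original text.

```python
import numpy as np

# Illustrative data matrix X (n observations x p variables); purely synthetic.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Center each variable; PCA operates on mean-centered data.
Xc = X - X.mean(axis=0)

# Sample covariance matrix of the variables.
cov = np.cov(Xc, rowvar=False)

# Eigenvectors give the weights of the linear combinations (the PCs);
# eigenvalues give the variance captured by each component.
eigvals, eigvecs = np.linalg.eigh(cov)

# Order components by decreasing variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep k components (k < p) and transform the data into the reduced space.
k = 2
scores = Xc @ eigvecs[:, :k]

# Share of the overall variance preserved by the first k components.
explained = eigvals[:k].sum() / eigvals.sum()
print(f"Variance retained by {k} components: {explained:.2%}")
```

The resulting component scores can then be used in place of the original variables, for example as predictors in a subsequent regression or as inputs to a cluster analysis, as mentioned in the abstract.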