ABSTRACT

A common method of data reduction is principal component analysis (PCA). PCA is equivalent to shifting and rotating the coordinate axes of the data space. The principal components are computed from the eigenvectors of the covariance matrix, so it is necessary to review the covariance matrix and the role of eigenvectors before determining the principal components. Dimensionality can be reduced when the positions of the data points along one coordinate are similar to those along another, or are a linear combination of the others. This redundancy becomes evident in the covariance matrix, which indicates which dimensions depend on each other. The mathematical representation of data in matrices tends to differ from the representation preferred in Python scripting. In theory, each data vector is stored as a column of a matrix, whereas Python scripts tend to store each data vector as a row.
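
The steps summarized above can be illustrated with a minimal sketch in Python using NumPy. It assumes the row-per-data-vector convention mentioned above; the function name `pca`, the parameter `n_components`, and the synthetic data set are hypothetical and chosen only for illustration. The third coordinate is deliberately built as a linear combination of the first two, so one eigenvalue of the covariance matrix should be close to zero.

```python
import numpy as np

def pca(data, n_components):
    """Illustrative PCA sketch. Rows of `data` are data vectors,
    columns are dimensions (the usual Python convention)."""
    # Shift: move the origin to the mean of the data.
    centered = data - data.mean(axis=0)
    # Covariance between dimensions; rowvar=False because each
    # column (not each row) is a variable.
    cov = np.cov(centered, rowvar=False)
    # eigh is suited to symmetric matrices and returns eigenvalues
    # in ascending order, so reorder them from largest to smallest.
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Rotate: project the data onto the leading eigenvectors.
    projected = centered @ evecs[:, :n_components]
    return projected, evals, evecs

# Synthetic data (an assumption for this example): the third
# column is a linear combination of the first two.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.normal(size=200)
data = np.column_stack([x, y, 2.0 * x - y])

projected, evals, evecs = pca(data, 2)
print(evals)            # the smallest eigenvalue is near zero
print(projected.shape)  # two coordinates kept for each data vector
```

If the data were instead stored with one data vector per column, as in the mathematical convention, the same sketch would apply after transposing the matrix (or by calling `np.cov` with its default `rowvar=True` and projecting accordingly).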