ABSTRACT

Today’s data sets tend to be massive, both in the number of observations n and in the number of variables or characteristics p that we measure. Consequently, we often need to reduce the number of variables through dimensionality reduction or feature transformation before we can analyze the data. Dimensionality reduction is the topic of the first two sections of this chapter, which cover principal component analysis and multidimensional scaling. In the last section, we present ways to visualize high-dimensional data before any transformation is applied: scatter plot matrices, parallel coordinate plots, and Andrews’ curves.