ABSTRACT

Dimensionality is an explicit factor in the computational cost of many algorithms. There are several ways to perform dimensionality reduction. The first is feature selection, which typically means looking through the available features and deciding whether each is actually useful, i.e., correlated with the output variables. The second is feature derivation, which means deriving new features from the old ones, generally by applying transforms to the dataset that change the axes of the graph by moving and rotating them; such a transform can be written simply as a matrix applied to the data. Principal Components Analysis (PCA), for example, maps two-dimensional data lying on an ellipse onto a single principal component, which lies along the principal axis of the ellipse. Like PCA, Multi-Dimensional Scaling tries to find a linear approximation to the full dataspace that embeds the data into a lower dimensionality.
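
As a concrete illustration of the matrix view of feature derivation, the short sketch below (a minimal example, not the chapter's own code) generates synthetic two-dimensional points on a rotated ellipse and projects them onto their principal axis using NumPy; the data, variable names, and parameter values are illustrative assumptions only.

    import numpy as np

    # Synthetic 2D data: points on an ellipse with axis lengths 3 and 0.5,
    # rotated by 30 degrees (illustrative values).
    rng = np.random.default_rng(0)
    t = rng.uniform(0, 2 * np.pi, 500)
    pts = np.column_stack((3 * np.cos(t), 0.5 * np.sin(t)))
    theta = np.radians(30)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    X = pts @ R.T

    # PCA as a matrix transform: centre the data, take the eigenvectors of the
    # covariance matrix, and project onto the leading eigenvector, i.e. the
    # principal axis of the ellipse.
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    principal_axis = eigvecs[:, -1]          # direction of largest variance
    one_d = Xc @ principal_axis              # 2D ellipse reduced to 1 component
    print(one_d.shape)                       # (500,)

The projection step is just a matrix (here a single column vector) applied to the centred data, which is the sense in which feature derivation "changes the axes" of the original graph.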