ABSTRACT

Sample size reduction reduces the data by estimating parametric or nonparametric models that preserve selected properties of the data. Cardinality reduction includes, for example, binning, which divides the values of a feature into intervals and assigns a representative value to each bin. Dimensionality reduction techniques reduce the number of variables and fall into three types of approaches. An intuitive explanation of the curse of dimensionality is that the volume of a space grows exponentially with its dimension, so a fixed number of data points becomes increasingly sparse in high-dimensional spaces. Many techniques are available for reducing the dimension of a dataset from its extrinsic dimension to its intrinsic one. Factor analysis is a linear method in which every observed variable is represented as a linear function of some unobservable common factors. Dimensionality reduction techniques have also been extended to the nonlinear case.
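
As an illustration of the binning idea mentioned above, the following minimal sketch divides a numeric feature into equal-width intervals and replaces each value by the mean of its bin. The function name `equal_width_bins`, the choice of equal-width intervals, and the use of the bin mean as the representative value are assumptions made for this example; other binning rules (equal-frequency intervals, the median as representative) would fit the same description.

```python
from statistics import mean


def equal_width_bins(values, n_bins):
    """Replace each value by the mean of its equal-width bin (illustrative only)."""
    lo, hi = min(values), max(values)
    # Width of each interval; fall back to 1.0 if all values are identical.
    width = (hi - lo) / n_bins or 1.0
    # Bin index of each value; the maximum is clamped into the last bin.
    idx = [min(int((v - lo) / width), n_bins - 1) for v in values]
    # Representative value of each non-empty bin: the mean of its members.
    reps = {b: mean(v for v, i in zip(values, idx) if i == b) for b in set(idx)}
    return [reps[i] for i in idx]


data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 21.0, 22.0]
reduced = equal_width_bins(data, 3)
```

In this example the nine distinct values are reduced to three representative values, one per bin, which is the sense in which binning reduces the cardinality of a feature.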