ABSTRACT

This chapter introduces techniques for finding representations of data samples in terms of different features, using the task of distinguishing between the digits zero and one as a running example. The data samples often lie in a lower-dimensional subset of the feature space. Principal Component Analysis computes the covariance matrix of the data samples and seeks a subspace of a given dimension of the feature space such that the projections of the data samples onto that subspace are as spread out as possible. The Joint Photographic Experts Group created a standard for lossy image compression in 1992. The Expectation-Maximization algorithm computes the expectation of the logarithm of the joint data likelihood with respect to the conditional probability distribution of the latent variables. The kernel trick is introduced through a mapping of the feature space to a higher-dimensional space in which the data samples are more easily separable.
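
The Principal Component Analysis procedure summarized above can be sketched as follows. This is a minimal illustration using NumPy on synthetic data, not the chapter's implementation: the covariance matrix of the centered samples is eigendecomposed, and the eigenvectors with the largest eigenvalues span the subspace onto which the projected samples are most spread out.

```python
import numpy as np

def pca_subspace(X, k):
    """Return the top-k principal directions of the samples in X (one row per sample)."""
    # Center the samples so the covariance is computed about the mean.
    Xc = X - X.mean(axis=0)
    # Sample covariance matrix of the features.
    cov = Xc.T @ Xc / (len(X) - 1)
    # eigh returns eigenvalues of the symmetric matrix in ascending order.
    vals, vecs = np.linalg.eigh(cov)
    # The k eigenvectors with the largest eigenvalues span the subspace
    # maximizing the spread (variance) of the projected samples.
    return vecs[:, ::-1][:, :k]

# Synthetic samples (an assumption for illustration): points spread
# mostly along the direction (3, 1) with a small amount of noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1)) @ np.array([[3.0, 1.0]]) \
    + 0.1 * rng.normal(size=(200, 2))
W = pca_subspace(X, 1)
```

The recovered direction `W[:, 0]` is then close (up to sign) to the dominant direction of the data.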