ABSTRACT

Clustering in high-dimensional spaces is a recurrent problem in many domains, such as genotype analysis and object recognition, and it is especially challenging for model-based methods. Because of the well-known “curse of dimensionality”, model-based clustering methods are often over-parameterized and perform unsatisfactorily on high-dimensional data. Since high-dimensional data usually live in low-dimensional subspaces hidden in the original space, many efforts have been made to allow model-based methods to cluster such data efficiently. In this chapter, we introduce several techniques for handling clustering problems in high-dimensional spaces, including the mixture of factor analyzers, clustering methods based on reduced projections, subspace methods, and regularized mixture modeling. Applications of several R packages are also discussed.