ABSTRACT
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 3.1 Conventional k-Means Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.1.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 3.1.2 Three Frameworks for the Square-Error Criterion . . . . . . . . . . . . . . . . . . . 34
3.1.2.1 Naive Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 3.1.2.2 Approximation Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 3.1.2.3 Probabilistic Modeling Framework . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2 Equivalent Reformulations of k-Means Criterion . . . . . . . . . . . . . . . . . . . . . . . . . 41 3.2.1 Anomalous Clustering Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3.2.2 Inner-Product k-Means Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.2.3 Kernel-Wise Criterion of Maximization of the Semi-Averaged
Internal Similarities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.2.4 Spectral Rayleigh Quotient Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.3 Challenges for k-Means Criterion and Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.3.1 Properties of the Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.3.1.1 Convexity of Cluster Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.3.1.2 Monotonicity of the Minimum with Respect to the
Number of Clusters K . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.3.1.3 Dependence on the Feature Measurement Scales . . . . . . . . . . . . . . 43 3.3.1.4 Propensity to Balance the Sizes of Clusters . . . . . . . . . . . . . . . . . . 44 3.3.1.5 Square Error Versus Consensus . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.3.2 Different Clustering Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3.3.3 Initialization of k-Means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.3.4 Getting a Deeper Minimum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.3.5 Interpretation Aids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 3.3.6 Feature Weighting, Three-Stage k-Means, and Minkowski Metric . . . . . . . . 49
3.4 Notes on Software for k-Means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
This chapter presents an updated review of k-means clustering, arguably the most popular clustering method. First, the square-error k-means criterion and method are introduced in three frameworks: a naive one, data recovery, and mixture of distributions. Then several
of
equivalent reformulations are given as leading to different local optimization strategies. A number of challenges and ways for addressing them are discussed as related to both the properties of solutions and ways for optimum finding. A few extensions are mentioned including fuzzy clustering and feature weighting.