ABSTRACT

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 5.1 Approximation Algorithms for k-Means and k-Median . . . . . . . . . . . . . . . . . . . . . 68 5.2 Lloyd’s Method for k-Means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 5.3 Properties of the k-Means Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.4 Local Search-Based Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5.4.1 Bounding the Cost of a Swap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 5.4.2 Adding It All Up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

5.5 Clustering of Stable Instances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 5.5.1 -Separability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 5.5.2 Proof Sketch and Intuition for Theorem 5.5 . . . . . . . . . . . . . . . . . . . . . . . . 79 5.5.3 Proof Sketch and Intuition for Theorem 5.6 . . . . . . . . . . . . . . . . . . . . . . . . 81 5.5.4 Approximation Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 5.5.5 Proof Sketch and Intuition for Theorem 5.7 . . . . . . . . . . . . . . . . . . . . . . . . 83 5.5.6 Other Notions of Stability and Relations between Them . . . . . . . . . . . . . . . 85 5.5.7 Runtime Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 5.5.8 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

5.5.8.1 Variants of the k-Means Objective . . . . . . . . . . . . . . . . . . . . . . . . . 88 5.5.8.2 k-Means++: Streaming and Parallel Versions of k-Means . . . . . . . . 88 5.5.8.3 Approximation Stability in Practice . . . . . . . . . . . . . . . . . . . . . . . 89

5.6 Mixture Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 5.6.1 Clustering Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

5.6.1.1 Handling Smaller C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 5.6.1.2 General Gaussians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

5.6.2 Spectral Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 5.6.2.1 Algorithmic Intuition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

5.6.3 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 5.6.3.1 The Case of Two Gaussians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 5.6.3.2 Reduction to a One-Dimensional Problem . . . . . . . . . . . . . . . . . . . 95 5.6.3.3 Solving the One-Dimensional Problem . . . . . . . . . . . . . . . . . . . . . 95 5.6.3.4 Solving the Labeling Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

of

In the first part of this chapter, we detail center-based clustering methods, namely methods based on finding a “best” set of center points and then assigning data points to their nearest center. In particular, we focus on k-means and k-median clustering which are two of the most widely used clustering objectives. We describe popular heuristics for these methods and theoretical guarantees associatedwith them.We also describe how to designworst case approximately optimal algorithms for these problems. In the second part of the chapter, we describe recent work on how to improve on these worst case algorithms even further by using insights from the nature of real world clustering problems and datasets. Finally, we also summarize theoretical work on clustering data generated from mixture models such as a mixture of Gaussians.