chapter  6
22 Pages

Hierarchical Clustering

ByPedro Contreras, Fionn Murtagh

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 6.2 Input Data, Dissimilarity, Metric, and Ultrametric . . . . . . . . . . . . . . . . . . . . . . . 104 6.3 Introduction to Hierarchical Clustering Using the Single Linkage

Agglomerative Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 6.4 Agglomerative Hierarchical Clustering Algorithms . . . . . . . . . . . . . . . . . . . . . . 108 6.5 Efficient Hierarchical Clustering Algorithms Using Nearest

Neighbor Chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 6.6 Density and Grid-Based Clustering Techniques . . . . . . . . . . . . . . . . . . . . . . . . . 114

6.6.1 Grid-Based Hierarchical Clustering Algorithms: A Survey . . . . . . . . . . . . 115 6.6.2 Linear Time Grid-Based Hierarchical Clustering . . . . . . . . . . . . . . . . . . . 116

6.7 Examples and Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 6.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

To begin with, we review dissimilarity, metric, and ultrametric. Next, we introduce hierarchical clustering using the single link agglomerative criterion. Then we present agglomerative hierarchical clustering in full generality. Storage and computational properties are reviewed. This includes the state-of-the art agglomerative hierarchical clustering algorithm that uses a nearest-neighbor chain and reciprocal nearest neighbors. We then review various, recently developed, hierarchical clustering algorithms that use density or grid-based approaches. That includes a linear time algorithm. A number of examples, and R implementation, completes this chapter.