ABSTRACT

Hierarchical clustering groups similar data points at multiple levels of similarity. This chapter introduces a bottom-up procedure called agglomerative hierarchical clustering and describes four methods of determining the distance between two clusters: average linkage, single linkage, complete linkage, and the centroid method. These methods differ in computational cost and may produce different clustering results. For example, the average linkage, single linkage, and complete linkage methods require computing the distance between every pair of data points drawn from the two clusters, whereas the centroid method compares only the two cluster centroids. The centroid method can also produce a nonmonotonic tree, in which the merging distance for a new cluster is smaller than the merging distance for a cluster formed earlier. A list of software packages that support hierarchical clustering is provided, and some applications of hierarchical clustering are given with references.
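
The nonmonotonic behavior of the centroid method can be illustrated with a small sketch. The code below is not taken from the chapter: the function names `centroid` and `centroid_linkage_merge_distances` and the three sample points are hypothetical, chosen only so that the second merge occurs at a smaller distance than the first. It assumes Euclidean distance and Python 3.8 or later (for `math.dist`).

```python
import math


def centroid(points):
    """Return the coordinate-wise mean of a list of points (hypothetical helper)."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def centroid_linkage_merge_distances(points):
    """Naive agglomerative clustering that merges the two clusters whose
    centroids are closest, returning the merging distance of each step."""
    clusters = [[p] for p in points]  # start with one cluster per point
    merge_distances = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest centroid distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merge_distances.append(d)
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merge_distances


# Illustrative data (an assumption): a nearly equilateral triangle.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.8)]
distances = centroid_linkage_merge_distances(points)
print(distances)
```

With these points, the first merge joins the two points on the x-axis at distance 2. Their centroid, (1, 0), lies at distance 1.8 from the remaining point, so the second merging distance is smaller than the first and the resulting tree is nonmonotonic.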