ABSTRACT

Partitional and hierarchical clustering are the two most widely studied families of clustering algorithms. They have been used heavily in a wide range of applications, primarily because they are simple and easy to implement relative to other clustering algorithms. Partitional methods must be provided with a set of initial seeds, which are then improved iteratively. Hierarchical clustering can be performed in two ways: bottom-up (agglomerative) and top-down (divisive). Bottom-up methods start with each data point in its own cluster and successively merge clusters, whereas top-down methods start with all data points in a single cluster and successively split it. In either case, hierarchical clustering algorithms approach the problem by building a binary tree-based data structure called the dendrogram. Once the dendrogram is constructed, one can choose the number of clusters by cutting the tree at different levels, obtaining different clustering solutions for the same dataset without rerunning the clustering algorithm.
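As a minimal illustration of the dendrogram idea, the sketch below builds a bottom-up hierarchy once and then cuts it at several levels. It is not taken from this work: the function names (`single_linkage`, `cut`), the choice of single linkage as the merge criterion, and the one-dimensional toy data are all assumptions made for the example.

```python
from itertools import combinations


def cluster_distance(points, a, b):
    """Single-linkage distance: smallest gap between any two members (1-D points)."""
    return min(abs(points[i] - points[j]) for i in a for j in b)


def single_linkage(points):
    """Bottom-up clustering; returns the merge history, i.e. the dendrogram."""
    # Every point starts in its own cluster, keyed by an integer id.
    clusters = {i: [i] for i in range(len(points))}
    merges = []  # one (id_a, id_b, distance, new_id) tuple per merge
    next_id = len(points)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters and merge them.
        a, b = min(
            combinations(clusters, 2),
            key=lambda p: cluster_distance(points, clusters[p[0]], clusters[p[1]]),
        )
        d = cluster_distance(points, clusters[a], clusters[b])
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d, next_id))
        next_id += 1
    return merges


def cut(merges, n_points, k):
    """Replay only the first n_points - k merges, which leaves exactly k clusters."""
    clusters = {i: [i] for i in range(n_points)}
    for a, b, _, new_id in merges[: n_points - k]:
        clusters[new_id] = clusters.pop(a) + clusters.pop(b)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    data = [1.0, 1.2, 5.0, 5.1, 9.0]   # hypothetical toy dataset
    tree = single_linkage(data)        # built once
    print(cut(tree, len(data), 3))     # [[0, 1], [2, 3], [4]]
    print(cut(tree, len(data), 2))     # [[0, 1, 2, 3], [4]]
```

The hierarchy is computed a single time; each call to `cut` only replays a prefix of the recorded merges, which is why different numbers of clusters can be read off without rerunning the clustering step.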