ABSTRACT

This chapter is concerned with the problem of classifying previously unclassified material, where both the number of groups and their composition are unknown at the outset. It gives a relatively brief account of three types of clustering methods: agglomerative hierarchical techniques, k-means clustering, and model-based clustering. The idea of sorting similar objects into categories is a primitive one; early humans, for example, must have been able to recognize that many individual objects shared properties such as being edible, poisonous, or ferocious. Cluster analysis techniques are used to search for clusters or groups in multivariate data that have no a priori classification. Because examining every possible partition of the data is impracticable, algorithms have been developed that search for the minimum value of a clustering criterion by rearranging an existing partition and keeping the new one only if it improves the criterion.
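
The rearrangement idea in the last sentence can be illustrated with a minimal sketch. It is not the chapter's own algorithm. It assumes the clustering criterion is the within-group sum of squares, starts from a random partition, and moves one object at a time, keeping a move only if the criterion decreases. The names within_group_ss and relocate, the seed parameter, and the small tolerance are hypothetical choices made for illustration.

```python
import random


def within_group_ss(points, labels, k):
    """Sum of squared distances from each point to the mean of its group."""
    total = 0.0
    for g in range(k):
        members = [p for p, lab in zip(points, labels) if lab == g]
        if not members:
            continue  # an empty group contributes nothing to the criterion
        dim = len(members[0])
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return total


def relocate(points, k, seed=0):
    """Improve a random partition by single-object moves (assumed scheme).

    Returns the final labels and the criterion value. The result is only a
    local minimum: no single move improves it, but a better partition may exist.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]  # arbitrary starting partition
    best = within_group_ss(points, labels, k)
    improved = True
    while improved:
        improved = False
        for i in range(len(points)):
            current = labels[i]
            for g in range(k):
                if g == current:
                    continue
                labels[i] = g  # tentatively move object i to group g
                candidate = within_group_ss(points, labels, k)
                if candidate < best - 1e-12:
                    # Keep the new partition only if it is an improvement.
                    best = candidate
                    current = g
                    improved = True
            labels[i] = current  # restore the best group found for object i
    return labels, best


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(relocate(data, k=2))
```

Because only improving moves are accepted, the criterion never increases and the search stops after a full pass with no accepted move. The partition it reaches depends on the starting partition, so in practice such a search is often repeated from several starts.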