ABSTRACT

The algorithms, the authors have described up to now are examples of a general approach referred to as supervised machine learning. Hierarchical clustering starts by defining each observation as a separate group, then the two closest groups are joined into a group iteratively until there is just one group including all the observations. To generate actual groups, the people can do one of two things: decide on a minimum distance needed for observations to be in the same group or decide on the number of groups the people want and then find the minimum distance that achieves this. In general, the choice of how to fill in missing data, or if one should do it at all, should be made with care. If the information about clusters in included in just a few features, including all the features can add enough noise that detecting clusters becomes challenging.