ABSTRACT

Reducing the number of observations is an important task for describing and understanding the sample at hand. It may also be needed in order to obtain profiles, which are then used to build policies. The central idea is that heterogeneity in the sample arises mostly from the presence of a small number k of homogeneous groups, with k much smaller than the sample size n. In most cases this operation is completely unsupervised, that is, group membership is not known in advance for any of the subjects. This is the setting of cluster analysis. The group labels it produces are per se meaningless; careful interpretation of the cluster profiles helps the user characterize the (optimal) groups. Cluster profiles, to be defined more carefully below, can be seen as the reduced sample and, if needed, used as a new data set.