ABSTRACT

Observations drawn from several different sources, or populations, call for methodology and analysis different from those used for a single batch of multivariate data. For each single batch, one may be concerned with describing and analyzing the configuration of variables and the scatter of units, and with considering distributions, outliers, models, and other summaries. The new aspect that appears when several batches of data are available is the comparison of batches. Problems arise in searching for and identifying characteristics on which the batches differ, in measuring "distances" between batches, in appraising the significance or possible randomness of observed differences, and in classifying additional units as similar to one or another of the batches.