ABSTRACT

Statistical data often arise from two or more different types or groups (classes, categories), where the grouping effect is known in advance. As mentioned in Section 1.3, one of the main roots of chemometrics is the effort to solve chemical classification problems, especially the automatic recognition of substance classes from molecular spectral data (Crawford and Morrison 1968; Jurs et al. 1969) and the assignment of the origin of samples (Kowalski and Bender 1972). Such applications were called PATTERN RECOGNITION IN CHEMISTRY (Brereton 1992; Varmuza 1980) before the term chemometrics was introduced. More recently, classification problems have attracted increasing interest, for instance in the classification of technological materials using near-infrared data (Naes et al. 2004), in biochemistry, in medical applications, and in multivariate image analysis (Xu et al. 2007). The IDENTIFICATION of objects can be considered a special case of classification in which each group contains only one object. An important task of this type is the identification of chemical compounds from spectral data; however, this topic is touched on only marginally, in Section 5.3.3 (k-nearest neighbor [k-NN] classification).