ABSTRACT

The problem of discovering new taxonomies (classifications of objects according to some natural relationships) from data has received considerable attention in the statistics and machine learning communities. In this chapter, we are concerned with a particular type of taxonomy discovery, namely cluster analysis: the discovery of distinct and nonoverlapping subpopulations within a larger population, where the members of each subpopulation share common features or properties deemed relevant in the problem domain under study. This type of unsupervised analysis is of particular significance in the emerging field of functional genomics and microarray data analysis.