ABSTRACT

This chapter investigates approaches to statistical analysis, first highlighting the difference between exploratory and confirmatory methods. Exploratory analysis can reveal many aspects of population structure, some of them non-obvious and even surprising; this is demonstrated with the Tilted Parabolas and Twin Arches toy data sets, and with the Lung Cancer and Pan Cancer subsets of The Cancer Genome Atlas. While such methods are very good at revealing important population structure, they also have strong potential to find spurious artifacts of sampling variation that are not reproducible in independently generated data sets. Hence, methods that confirm the actual existence of discovered phenomena are also critical. Pointers are given to the detailed development of such methods in later chapters. There is also an overview of major OODA statistical tasks.