ABSTRACT

The past 15 years have witnessed remarkable developments in the nature and volume of genetic variation data available for statistical genetic analysis. Genetic variation generally refers to differences between individuals in the DNA inherited from their parents. Normally, identical DNA is contained in each of our cells and does not change during a lifetime; it is organized as a string of paired nucleotides for each of 22 chromosomes (autosomes), plus the X and Y sex chromosomes. A base pair is a pair of nucleotides at a specific position. Simply speaking, a gene can be defined as a set of specific DNA instructions that code for an RNA or protein product. These products in turn can affect the development of physical features as well as the production of proteins and metabolites that have biological consequences and may eventually play a role in disease causation and physiological variation. The genome refers to all of a person’s nucleotides across the chromosomes and comprises approximately 3 billion nucleotides in total, indexed by base-pair position, of which roughly 2% lie within the coding regions of genes, known as exons. As a first step toward discovering and characterizing the role of genes, genetic analysis of DNA variation investigates the relationships between specific DNA variants and measurable human traits.