ABSTRACT

Key Notes

Background

Genomes of increasing size have been sequenced since 1980. Before a genome sequencing project begins, good framework maps must be produced, by either physical or genetic mapping. Restriction fragment length polymorphisms (RFLPs) were originally used as markers, but these have been superseded by variable number tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs), which are now available in large numbers.

Genetic maps

Genetic maps are based on recombination frequencies between markers. They are good for ordering genes but, because the frequency of recombination is not constant throughout the genome, they do not give accurate measurements of the physical distances between markers. High-resolution genetic maps are required to map genes.
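The way recombination frequencies give marker order can be illustrated with a short sketch. It assumes offspring have already been scored as recombinant or non-recombinant for each pair of markers. The function names, the marker labels A, B and C, and the counts are hypothetical and chosen only for illustration. For three linked markers, the pair with the highest recombination frequency is taken to be the outer pair.

```python
def recombination_frequency(recombinants, total):
    """Fraction of scored offspring that are recombinant for a marker pair."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recombinants / total


def order_three_markers(pairwise_rf):
    """Order three linked markers from their pairwise recombination frequencies.

    pairwise_rf maps frozenset({marker1, marker2}) to a recombination frequency.
    The pair with the largest frequency is assumed to flank the third marker.
    """
    outer = max(pairwise_rf, key=pairwise_rf.get)
    markers = set().union(*pairwise_rf)
    (middle,) = markers - outer
    left, right = sorted(outer)
    return [left, middle, right]


# Hypothetical counts: recombinant offspring out of 1000 scored per marker pair.
pairwise_rf = {
    frozenset({"A", "B"}): recombination_frequency(50, 1000),
    frozenset({"B", "C"}): recombination_frequency(120, 1000),
    frozenset({"A", "C"}): recombination_frequency(160, 1000),
}
marker_order = order_three_markers(pairwise_rf)
```

In this made-up data set the A-C frequency is the largest, so B is placed between A and C. The frequencies give order and relative genetic distance only; as noted above, they are not a reliable measure of physical distance.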

Physical maps

Physical maps are constructed by subdividing the genome into smaller pieces and then determining the genes or DNA markers present on each piece. Maps can also be produced by cloning human genomic DNA in specialized vectors. Yeast artificial chromosomes (YACs) were of great importance in the construction of earlier physical maps, but are being replaced by bacterial artificial chromosomes (BACs) and P1 artificial chromosomes (PACs). The cloned inserts are organized into continuous arrays of overlapping fragments (contigs), which are then anchored to the framework map by means of sequence tagged sites (STSs). Somatic cell hybrids have been very useful in developing physical maps of certain mammalian species.

Sequence data

Two methods have been used to determine the sequence of long stretches of human DNA. Both are based on BACs. The first uses fingerprinting to map BACs into contigs. Each element of the contig is then sequenced, and the resulting sequences are joined together on the basis of the contig map. The alternative method sequences the BACs first and then aligns the sequences to deduce the complete sequence of the chromosome; this approach has difficulty with repeated sequences. Both methods have been successful in producing extensive chromosomal DNA sequences.
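Joining clone sequences in the order given by a contig map can be sketched as follows. This is a toy illustration, not the procedure used by any real assembly software. It assumes error-free sequences that are already listed in map order, and the function names, the min_overlap parameter and the example sequences are all hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no exact overlap of at least `min_overlap` bases exists.
    """
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def join_in_map_order(clone_sequences, min_overlap=3):
    """Merge sequences, given in contig-map order, through their overlaps."""
    if not clone_sequences:
        return ""
    assembled = clone_sequences[0]
    for seq in clone_sequences[1:]:
        k = overlap_length(assembled, seq, min_overlap)
        if k == 0:
            raise ValueError("adjacent clones do not overlap")
        # Append only the part of the next clone that extends the assembly.
        assembled += seq[k:]
    return assembled


# Hypothetical clone sequences, listed in the order given by the contig map.
clones = ["ATGCGTAC", "GTACCTGA", "CTGAAGGT"]
assembled_sequence = join_in_map_order(clones)
```

Because the order of the clones is supplied by the map, only adjacent sequences need to be compared. When the order must instead be deduced from the sequences themselves, a repeated sequence can match in more than one place, which is the difficulty with repeats mentioned above.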

Placing genes on the map

Genes can be placed on the framework map in three ways. First, their recombination frequencies with known markers in the genetic framework map can be used to place them between markers. Second, known cDNAs can be mapped directly by using them to probe the DNA of contigs, which allows a gene to be assigned to a specific clone in a mapped contig. The same approach can be used with anonymous cDNAs, known as expressed sequence tags (ESTs), which allows the mapping of genes whose function is still unknown. Third, direct analysis of DNA sequence can be used to identify novel genes by searching for sequences characteristic of gene structures. Gene mining is the process of looking for new members of a specific gene family.
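Searching a DNA sequence for features characteristic of genes can be illustrated, in a much simplified form, by scanning for open reading frames. The sketch below is an assumption-laden toy: it looks only at the forward strand, treats a gene as an ATG start codon followed in the same frame by a stop codon, and ignores introns, promoters and splice sites, all of which matter in eukaryotic genomes. The function name, the min_codons parameter and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end) positions of forward-strand open reading frames.

    An ORF runs from an ATG to the first in-frame stop codon. `end` is the
    position just past the stop codon, so sequence[start:end] is the ORF.
    `min_codons` counts codons from ATG up to, but not including, the stop.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# Hypothetical sequence containing one short open reading frame.
example_sequence = "CCATGAAATTTGGGTAACC"
example_orfs = find_orfs(example_sequence)
```

For the example sequence, sequence[start:end] of the single reported frame is ATGAAATTTGGGTAA. Practical gene prediction combines many more signals than this and would normally also examine the reverse strand.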

Genome comparison

Comparison of genomes from different species reveals information about the genes required at various levels of complexity and about the evolution of different taxa. Massively parallel sequencing has allowed the coding regions of the genomes of more than 1000 humans to be sequenced. The data show that individuals carry 250 to 300 recessive mutations, and that children have about 60 new mutations that their parents did not have.

Environmental sequencing

Environmental sequencing analyzes DNA taken from environmental samples such as seawater. More than a thousand species may be identified in this way, including many unknown microbial species.

Related topics

Eukaryote genomes

Genetics in forensic science

Linkage

Ethics