ABSTRACT

Once a new gene has been located in a genome sequence, the question of its function has to be addressed. This has become an important area of genomics research, because completed sequencing projects have revealed that we know rather less than we thought about the content of individual genomes. Escherichia coli and Saccharomyces cerevisiae, for example, were studied intensively by conventional genetic analysis before the advent of sequencing projects, and geneticists were at one time fairly confident that most of the genes in these species had been identified. The genome sequences revealed that there are in fact large gaps in our knowledge. Of the 4288 protein-coding genes in the initial annotation of the E. coli genome sequence, only one-third were described as well characterized, and 38% had no attributed function. The figures were very similar for S. cerevisiae. Methods that enable functions to be assigned to genes are therefore of critical importance in understanding a genome sequence.