ABSTRACT

The usefulness of sequence databases for the biological sciences needs no defense. As a simple resource, sequence databases are drawn upon by scientists from a broad range of disciplines for a wide variety of needs. The databases themselves also generate new knowledge. For example, when a partial sequence was published for platelet-derived growth factor, a search of a protein sequence database revealed that it was almost identical to a portion of an oncogene known as v-sis. In other cases, sequence match-ups show that only portions of one protein are similar to another. The realization that all of biology is based on an enormous redundancy has extraordinary implications for the sequencing of the human genome, for nowhere in the biological world is the Darwinian notion of "descent with modification" more apparent than in the sequences of genes and gene products. A small number of starter types have been expanded by the general mechanism of duplication and subsequent modification, with a still-to-be-determined amount of intergenic shuffling.