ABSTRACT

As mutation and selection drive evolution, DNA and genes change over time. Because of
these forces, two species with a common ancestor will have similar, but non-identical,
genes in terms of base-pair sequence. This basic fact can be used for several purposes.
For example, suppose you know the DNA sequence of a human gene but are unsure of the
gene's function. If you can find a gene with a similar sequence in a closely related
species, then a reasonable conjecture is that the two genes have the same function.
Another application of this basic idea is in constructing phylogenetic trees. These are
graphical representations that show how closely related the species in a set are to one
another. Such considerations often have medical applications: for example, by knowing
how similar different strains of human immunodeficiency virus (HIV) are to one another,
we may have some idea of how effective a vaccine designed using a certain strain will
be. Yet another common application is in the recognition of common sequence patterns in
DNA (the motif recognition problem). Recognizing common patterns in DNA or proteins can
help determine where genes are in a sequence of DNA or proteins. We can also try to find
locations where certain proteins bind to DNA if we suppose there is some sequence of DNA
that is recognized by a protein.