ABSTRACT
As mutation and selection drive evolution, DNA and genes change over time. Due to
these forces, two species with a common ancestor will have similar but non-identical
genes in terms of base-pair sequence. This basic fact can be used for several purposes.
For example, suppose you know the DNA sequence for a human gene but are
unsure of the function of the gene. If you can find a gene with a similar sequence in
a closely related species, then a reasonable conjecture is that the two genes have the
same function. Another application of this basic idea is in constructing phylogenetic
trees. These are graphical representations that show how closely related a set of
species are to one another. Such considerations often have medical applications: for
example, by knowing how similar different strains of human immunodeficiency virus
(HIV) are to one another, we may have some idea of how effective a vaccine designed
using a certain strain will be. Yet another common application is
in the recognition of common sequence patterns in DNA (the motif recognition problem).
Recognizing common patterns in DNA or protein sequences can help determine where
genes lie within a longer sequence. We can also try to find the locations where
certain proteins bind to DNA, if we suppose that each such protein recognizes a
particular sequence of DNA.
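
As a toy illustration of what "similar sequence" and "motif recognition" might mean
computationally, the following Python sketch scores two equal-length sequences by the
fraction of matching positions and scans a longer sequence for the window closest to a
short pattern. The function names, the made-up sequences, and the mismatch-counting
score are all assumptions chosen for illustration. The sketch ignores insertions and
deletions, so it is not an alignment method.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences agree.

    Illustrative only: gaps (insertions/deletions) are not handled.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def best_motif_hit(sequence: str, motif: str) -> tuple[int, int]:
    """Return (start, mismatches) for the window of `sequence` closest to `motif`.

    Ties are resolved in favor of the leftmost window.
    """
    if not motif or len(motif) > len(sequence):
        raise ValueError("motif must be non-empty and no longer than sequence")
    best_start, best_mismatches = 0, len(motif) + 1
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        mismatches = sum(x != y for x, y in zip(window, motif))
        if mismatches < best_mismatches:
            best_start, best_mismatches = start, mismatches
    return best_start, best_mismatches


# Made-up sequences, not real genes.
gene_a = "ATGGCCTTAGC"
gene_b = "ATGGCTTTAGC"
print(percent_identity(gene_a, gene_b))           # one mismatch out of 11 positions
print(best_motif_hit("GGCATATAAAGGC", "TATAAA"))  # exact match starting at index 4
```

Counting mismatches position by position is the simplest possible similarity score.
It breaks down as soon as one sequence has gained or lost a base relative to the other.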