ABSTRACT

As soon as two protein sequences need to be compared, a task known as alignment arises. Alignment involves matching the amino acids of the two sequences in such a way that their similarity can be determined best. For example, if the two sequences are CDFG and CDEFS, it is clear that the alignment

C D - F G
C D E F S

provides the best assessment of their similarity. The real problem begins, however, when comparing two considerably different protein sequences of, for instance, 300 amino acids each. Ideally, the alignment of two sequences should agree with their evolution, i.e., the patterns of descent as well as the molecular structural and functional development. Unfortunately, the evolutionary traces are often very difficult to detect. For example, in the divergent evolution of two protein sequences from a common ancestor, amino acid mutations, insertions and deletions of residues, gene doubling, transposed gene segments, repeats, domain structures, and the like can blur the ancestral tie beyond recognition.
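
The small example with CDFG and CDEFS can be reproduced mechanically. The following is a minimal sketch of a standard global alignment by dynamic programming (the Needleman-Wunsch scheme), which is not necessarily the method treated in this text. The function name global_align and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen only for illustration; realistic protein comparisons would normally use a substitution matrix and more elaborate gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (aligned_a, aligned_b, score).

    Scoring values are illustrative assumptions, not taken from the text.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for pairing residue a[i-1] with residue b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


if __name__ == "__main__":
    row_a, row_b, total = global_align("CDFG", "CDEFS")
    print(row_a)   # CD-FG
    print(row_b)   # CDEFS
    print(total)   # 1
```

Under these assumed scores the gap opposite E is the only optimal placement, which matches the intuitive alignment. For two divergent sequences of about 300 residues the same procedure still returns a best-scoring alignment, but that score depends entirely on the chosen parameters and need not reflect the true evolutionary history.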