ABSTRACT

One way to study protein structure and function is site-directed mutagenesis, in which specific residues within a protein are altered and the effects of these changes on protein characteristics are then examined. Changes in the amino acid (aa) properties of the mutated sites (e.g., hydrophobicity, volume, and charge) can then be correlated with changes in protein characteristics [2]. Another approach is to analyze large families of naturally occurring proteins or protein domains. During divergent evolution, protein sequences change through genetic drift, while the biochemical function of the protein is substantially retained. The number of sequences exceeds the number of structures by several orders of magnitude; the number of three-dimensional protein structures corresponding to a given function is therefore small, ranging from one (such as the hand-shaped structure of nucleic acid polymerases) to a few (for example, the four families of endoproteases) [3,4]. The core conformation of homologous proteins persists long after statistically significant sequence similarity has vanished [5], and this persistence underlies all tools that predict a function or build a three-dimensional model of a protein by extrapolation from an experimental structure of a homologous sequence [6].