ABSTRACT

The computational protein design problem is combinatorially enormous (Park et al. 2004, 2005; Rosenberg and Goldblum 2006; Butterfoss and Kuhlman 2006). Even for a small protein of 100 residues, there are 20^100 ≈ 10^130 possible sequences. This collection of sequences, each constructed in a single copy, would occupy a space larger than the whole universe. Additional complexity arises if one tries to model protein flexibility. It remains intractable to perform full-scale molecular dynamics simulations within the protein design calculation. Hence, most protein design studies consider only the movement of protein side chains while the protein backbone remains fixed. Flexibility of amino-acid side chains is typically modeled by using a discrete set of statistically significant empirical conformations, called rotamers (see Chapter 13). With a larger number of rotamers used to represent each amino acid, the movement of side chains is modeled more accurately, but the design problem becomes correspondingly more complex. Even if only three rotamers were used for each amino acid, each position would have 20 × 3 = 60 possible states, and the number of rotameric assignments for a 100-residue protein would grow to 60^100 ≈ 10^178.
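
The two search-space sizes quoted above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original text; the function name log10_search_space and the constant names are hypothetical and chosen only for illustration. It works with base-10 logarithms so that the exponents can be read off directly.

```python
import math


def log10_search_space(options_per_position: int, n_positions: int) -> float:
    """Return log10 of options_per_position ** n_positions.

    Hypothetical helper for illustration: the search space is the number of
    independent choices at each position raised to the number of positions.
    """
    return n_positions * math.log10(options_per_position)


# Values taken from the text: a 100-residue protein, 20 amino acids,
# and three rotamers per amino acid.
N_RESIDUES = 100
AMINO_ACIDS = 20
ROTAMERS_PER_AMINO_ACID = 3

# Sequence space: 20 choices at each of 100 positions.
seq_exp = log10_search_space(AMINO_ACIDS, N_RESIDUES)

# Rotamer space: 20 * 3 = 60 choices at each of 100 positions.
rot_exp = log10_search_space(AMINO_ACIDS * ROTAMERS_PER_AMINO_ACID, N_RESIDUES)

print(f"sequences:           about 10^{seq_exp:.1f}")
print(f"rotamer assignments: about 10^{rot_exp:.1f}")
```

The exponents come out near 130 and 178, in line with the figures in the text. Adding rotamers multiplies the number of states per position, so the total grows by a factor of 3^100 (roughly 10^48) rather than by a factor of three.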