ABSTRACT

Proteins are the major players in molecular recognition, which lies at the heart of all processes of life. They interact with the other components of the cell (small molecules, nucleic acids, membranes, and other proteins) to build supramolecular assemblies and elaborate molecular machines that perform a wide range of functions, from chemical catalysis and mechanical work to signaling and regulation (Alberts, 1998). Protein-protein recognition is the mechanism by which specific interactions between polypeptide chains create functional units. Its study has been part of biochemistry, structural biology, and computational biology for more than 30 years, and it has now spread to all domains of biology and medical science (Eisenberg et al., 2000). Protein-protein recognition must be given a chemical and physical basis, which in practice requires high-resolution three-dimensional structures. The Protein Data Bank (PDB; Berman et al., 2000) contains that information for several hundred protein assemblies, mostly transient binary complexes and oligomeric proteins. Cells contain many larger assemblies, which are still poorly represented in the PDB, with the exception of the icosahedral viruses, ribosomes, and a few others (Dutta & Berman, 2005). Their analysis is the next frontier in our understanding of molecular recognition in biology.