ABSTRACT

The primary structure of a protein is the exact sequence of amino acids joined by peptide bonds in a given polypeptide. In multichain proteins, each chain is described separately. By convention, the primary structure is reported using three-letter or one-letter amino acid codes (see Chapter 1), starting from the amino-terminal (N) end and finishing at the carboxyl-terminal (C) end. The primary structure also requires specifying the crosslinking cysteines involved in the protein’s disulfide bonds as well as all the posttranslational modifications in the polypeptide chain (see Chapter 5). Frequently, the modifications include cleavage of the polypeptide chain; therefore, the sequence of the polypeptide does not always correspond to the sequence of its mature mRNA. The linear polypeptide chain folds into a particular arrangement, giving a defined three-dimensional structure. Proteins unfolded in vitro fold back to their original native state when the solution conditions are returned to those in which the folded protein exists. All the information for the native three-dimensional structure of the protein appears, therefore, to be contained within the primary structure. On this basis, prediction of the higher-order structures of proteins and of their functions is possible (see Chapter 19). Proteins fold spontaneously by a process called self-assembly; however, in vivo, polypeptide folding is often assisted by molecular chaperones. The assembly of the primary structure of peptides and proteins and the determination of the sequence are discussed in Chapter 4.