ABSTRACT

Learning goal: You can use functions to organize your programs better.

10.1 IN THIS CHAPTER YOU WILL LEARN

• How to write your own functions

• How to extract the sequence from the coordinates of a protein structure

• How to selectively extract information from a PDB file

• How to calculate the distance between atoms in a protein or DNA three-dimensional structure

10.2.1 Problem Description

Protein or nucleotide three-dimensional (3D) structures are stored in PDB files. PDB files are text files that contain both annotation and atomic coordinates (x, y, z) of biological molecules. Crystallographers or NMR spectroscopists collect this information from structure determination experiments and submit it to the Protein Data Bank (https://www.rcsb.org), which contains about 88,000 structures at the time of writing. The format of PDB files is described in Box 10.1, and a sample is shown in Appendix C, Section C.6, “An Example of a PDB File Header (Partial),” and Section C.7, “An Example of PDB File Atomic Coordinate Lines (Partial).”
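To make the idea of a text file holding both annotation and coordinates concrete, here is a minimal sketch of how a few small functions might pull coordinates out of atom lines and measure the distance between two atoms. It assumes the standard fixed-column layout of ATOM records (atom name in columns 13-16, residue name in columns 18-20, and x, y, z in columns 31-38, 39-46, and 47-54). The function names and the sample lines are hypothetical and written only for illustration; the coordinates are made up and do not come from a real PDB entry.

```python
import math

# Hypothetical sample lines: one annotation line and two atom lines.
# The coordinates are invented for illustration only.
sample_lines = [
    "REMARK   hypothetical example, not a real PDB entry",
    "ATOM      1  N   MET A   1       1.000   2.000   3.000  1.00  0.00",
    "ATOM      2  CA  MET A   1       4.000   6.000   3.000  1.00  0.00",
]


def parse_atom_line(line):
    """Return (atom_name, residue_name, (x, y, z)) from one atom line.

    Assumes the fixed-column ATOM/HETATM layout; Python slices are
    zero-based, so columns 31-38 become line[30:38], and so on.
    """
    atom_name = line[12:16].strip()
    residue_name = line[17:20].strip()
    x = float(line[30:38])
    y = float(line[38:46])
    z = float(line[46:54])
    return atom_name, residue_name, (x, y, z)


def read_atoms(lines):
    """Parse only the coordinate lines, skipping annotation lines."""
    return [
        parse_atom_line(line)
        for line in lines
        if line.startswith(("ATOM", "HETATM"))
    ]


def distance(p, q):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


atoms = read_atoms(sample_lines)
n_atom, ca_atom = atoms[0], atoms[1]
print(n_atom[0], ca_atom[0], distance(n_atom[2], ca_atom[2]))
```

Splitting the work into three functions keeps each step easy to test on its own: one reads a single line, one filters a whole file, and one does the geometry. To use the sketch on a real file, the list of sample lines could be replaced by the lines read from an opened PDB file.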