ABSTRACT

Living matter is rich in information content. The information is carried above all in the polymeric sequence of DNA, which may be transferred to RNA and protein through transcription and translation. These polymers consist of more than one kind of monomer building block, and the information is embedded in the order in which the different monomers are arranged and in the length of the sequence. Comparable heteropolymer information systems are also employed in the sequence of letters and words in languages and the sequence of notes in music. The maximum information content $I_{\max}$ of a sequence is given in bits by:

$$I_{\max} = N \log_2 M$$

where M is the number of kinds of monomers and N the sequence length. Complex chemical compounds such as chlorophyll and heme store information in the spatial arrangement of their atoms and bonds, but these compounds are difficult to replicate and are constrained by modest upper limits on information content. In contrast, polymeric information can be made practically unlimited by lengthening the sequence. The maximum information content of human genomic DNA, with four different monomers and a length of three billion base pairs, is:

$$I_{\max} = 3 \times 10^{9} \log_2(4) = 6 \times 10^{9} \text{ bits}$$

This amount of information is equivalent to a 400-volume manual for the construction of a human, with 600 pages per volume and 5,000 printed characters per page, each character carrying 5 bits. It is not even near the upper limit for DNA. The lungfish genome contains far more DNA than the human genome.
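
The arithmetic above can be checked with a short sketch. It assumes the maximum information content is N log2(M) bits, as implied by the human genome calculation, and uses only the figures quoted in the text. The function and variable names are illustrative and do not come from the original.

```python
import math


def max_information_bits(num_monomer_kinds: int, sequence_length: int) -> float:
    """Upper bound on information content in bits, assuming N * log2(M)."""
    return sequence_length * math.log2(num_monomer_kinds)


# Human genomic DNA: four kinds of monomers, three billion base pairs.
human_bits = max_information_bits(4, 3_000_000_000)

# The manual described in the text: 400 volumes, 600 pages per volume,
# 5,000 characters per page, 5 bits per character.
manual_bits = 400 * 600 * 5_000 * 5

print(f"Human genome upper bound: {human_bits:.0e} bits")
print(f"400-volume manual:        {manual_bits:.0e} bits")
print("Figures agree:", human_bits == manual_bits)
```

Because the bound grows linearly with N but only logarithmically with M, lengthening the sequence raises the ceiling far faster than adding more kinds of monomers would.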