ABSTRACT

A simple model of the protein sequence space is proposed. The observed frequencies of overlapping genes are then used to estimate the probability of obtaining a functional protein from a large pool of random sequences. The concept of the sequence space was introduced by Hamming, and its application to protein sequences was proposed by J. Maynard Smith. Successive introduction of several point mutations into a given protein sequence defines a walk in the fitness space; such a walk can represent, for instance, the evolution of a protein. An alternative to theoretical considerations is to design a library of random gene sequences and then attempt to select a biologically active protein from the resulting pool of random polypeptides. There is experimental evidence that short random DNA or protein sequences can have a biological function. The construction of large random gene libraries, however, faces several technical obstacles.
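
The notions of a sequence space and of a walk generated by successive point mutations can be made concrete with a short sketch. The code below is illustrative only and is not the model proposed in this paper: it assumes a 20-letter amino acid alphabet, uses the Hamming distance as the metric on sequences of equal length, and picks mutations uniformly at random. All function names are hypothetical.

```python
import random

# Assumption: the 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def point_mutation(seq: str, rng: random.Random) -> str:
    """Replace one randomly chosen residue with a different amino acid."""
    pos = rng.randrange(len(seq))
    choices = [aa for aa in AMINO_ACIDS if aa != seq[pos]]
    return seq[:pos] + rng.choice(choices) + seq[pos + 1:]


def mutational_walk(start: str, steps: int, rng: random.Random) -> list[str]:
    """Apply successive point mutations; consecutive sequences are neighbours."""
    walk = [start]
    for _ in range(steps):
        walk.append(point_mutation(walk[-1], rng))
    return walk


def sequence_space_size(length: int) -> int:
    """Number of distinct protein sequences of the given length."""
    return len(AMINO_ACIDS) ** length


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the walk is reproducible
    for seq in mutational_walk("ACDEFGHIKL", steps=5, rng=rng):
        print(seq, hamming_distance("ACDEFGHIKL", seq))
```

Each step of the walk moves to a sequence at Hamming distance 1 from the previous one, so after k steps the walk is at most k mutations away from its starting point. The size of the space grows as 20 to the power of the sequence length, which is why the fraction of functional sequences matters for any search over random pools.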