ABSTRACT

What are the design principles of biological networks? Are there specific constraints that operate on their structure as they evolve? What are the mechanisms of network evolution and what properties of biological networks do these mechanisms preserve? Questions such as these are difficult to answer in general, but have led to a flurry of activity in the area of building model networks that mimic the large-scale connectivity properties of networks observed in nature. In order to compare model networks to real biological networks, we consider here various global statistics that describe network structure, including the degree distribution, clustering coefficients, and mean path lengths.

As we have discussed in earlier chapters, many biological interactions at the molecular level consist of physical binding events and/or chemical reaction events. We know that one way a gene regulatory interaction occurs is when a transcription factor protein binds to the promoter region of the gene that is being transcribed. A protein-protein interaction consists of physical binding of two proteins to form a complex. In a metabolic network, a directed interaction edge signifies the occurrence of a chemical reaction. In more abstract networks, on the other hand, edges may not have such a transparent interpretation but may still arise out of multiple binding/reaction mechanisms. For example, two genes that have a synthetic lethal interaction frequently lie on separate pathways of serial protein-protein interactions that are connected to each other at one end point.

The physico-chemical events that underlie a network are susceptible to changes due to gene mutations. Mutations alter the gene sequences (and protein structures) of the fundamental players (nodes) and therefore change binding energies and reaction rates. Mutations can also alter DNA sequences where proteins (such as transcription factors) may bind to regulate a gene. A series of mutations in such noncoding DNA regions accumulated during the process of biological evolution will therefore lead to rewiring of the network due to altered transcription factor binding. Old binding sites may be lost. New sites can emerge. Furthermore, genes can be duplicated over the course of evolution and the number of such duplicates can change as the organism evolves. Under this process, multiple genes may carry the same biological function and therefore the same sets of network connections. These multiple genes, in turn, are independently susceptible to evolutionary divergence by mutations, thus leading to further rewiring of the network. Such considerations lead to a set of putative rules for the evolutionary dynamics of a network, consisting of a mixture of node duplication, node specification, and rewiring processes. The precise mechanisms that determine how these processes are manifested at the network level, as well as the relative frequencies of occurrence of each process, are largely unknown, and different assumptions about the nature of these processes and their frequencies lead to strikingly different types of model networks.

In this chapter, we lay the foundations for modeling network growth and evolution by using well-studied theoretical models as examples. The aim here is to understand the origins of biological complexity at the network level, as well as to use the models to extract information about how evolution operates at the network level. To gain insight into these models, we begin with a discussion of regular, random, and small-world networks, the way they are “wired,” and their properties. Because large networks are complex objects, it is important to extract a few statistical measures that capture their overall properties in order to facilitate comparison of real networks with model networks. The statistical properties that we focus on here are the mean path length through the network and the average clustering coefficient of the network.

We further discuss various classes of models of network evolution, including biophysical models that represent first steps in bringing large network models into the familiar domain of known molecular interaction mechanisms. We then turn to the question of evolutionary insights that we obtain when model networks and model network mechanisms are confronted with real data. Unless otherwise stated, we assume in this chapter that all networks under consideration are connected networks or connected components of larger, disconnected networks (Box 7.1).
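
To make the global statistics named above concrete, the following minimal Python sketch computes a degree distribution, an average clustering coefficient, and a mean path length for a small connected, undirected network. It is an illustration under stated assumptions, not the chapter's own procedure: the function names (`ring_lattice`, `add_shortcuts`, and so on), the adjacency-set representation, and the network size are all hypothetical choices made here. The "small-world" step simply adds a few random shortcut edges to a regular ring, a simplification chosen because it cannot disconnect the network; the models discussed in the chapter may be constructed differently.

```python
from collections import deque
import random


def ring_lattice(n, k):
    """Regular ring: each node is linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_shortcuts(adj, m, seed=0):
    """Return a copy of adj with m extra random edges (illustrative assumption)."""
    rng = random.Random(seed)
    new = {u: set(nbrs) for u, nbrs in adj.items()}
    nodes = list(new)
    added = 0
    while added < m:
        u, v = rng.sample(nodes, 2)  # two distinct nodes
        if v not in new[u]:
            new[u].add(v)
            new[v].add(u)
            added += 1
    return new


def degree_distribution(adj):
    """Fraction of nodes having each degree."""
    counts = {}
    for nbrs in adj.values():
        counts[len(nbrs)] = counts.get(len(nbrs), 0) + 1
    return {d: c / len(adj) for d, c in counts.items()}


def average_clustering(adj):
    """Mean over nodes of (links among neighbours) / (possible such links)."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # nodes with fewer than two neighbours contribute zero
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def mean_path_length(adj):
    """Average shortest-path length over all reachable ordered node pairs (BFS)."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(20, 4)            # regular network
shortcut = add_shortcuts(lattice, 5)     # regular network plus 5 random edges

for name, g in (("regular ring", lattice), ("ring + shortcuts", shortcut)):
    print(name, average_clustering(g), mean_path_length(g))
```

For the 20-node ring with four neighbours per node, every node has degree 4, the average clustering coefficient is 0.5, and the mean path length is 55/19 (about 2.89). Adding edges can never lengthen a shortest path, so the shortcut network's mean path length is at most that of the ring; how much it drops depends on where the random edges happen to fall.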