ABSTRACT

Understanding gene regulation through the underlying transcriptional regulatory networks (TRNs) is a central topic in biology. A TRN can be thought of as a set of proteins (known as transcription factors, or TFs), genes, small modules, and their mutual regulatory interactions. The potentially large number of components, the high connectivity among them, and the transient stimulation in the network make TRNs highly complex. With rapid advances in molecular technologies and enormous amounts of data being collected, intensive efforts have been made to dissect TRNs using data generated by state-of-the-art technologies, including gene expression data and other data types (e.g., Chu et al. 1998; Ren et al. 2000; Davidson et al. 2002; Lee et al. 2002; Bar-Joseph et al. 2003; Zhang and Gerstein 2003). The computational methods include gene clustering (e.g., Eisen et al. 1998; Roberts et al. 2000), Boolean network modeling (e.g., Liang et al. 1998; Akutsu et al. 1999, 2000; Shmulevich et al. 2002), Bayesian network modeling (e.g., Friedman et al. 2000; Hartemink et al. 2001, 2002), differential equation systems (e.g., Gardner et al. 2003; Tegnér et al. 2003), information integration methods (e.g., Gao et al. 2004), and other approaches. For recent reviews, see de Jong (2002) and Sun and Zhao (2004). As discussed in our review (Sun and Zhao 2004), although a large number of studies are devoted to inferring TRNs from gene expression data alone, such data provide only a very limited amount of information. Other data types, such as protein-DNA interaction data, which measure the binding targets of each TF through direct biological experiments, may add more information and should be combined with gene expression data for network inference.