ABSTRACT

This chapter presents an overview of the informatics tools available as databases of pathway and biological network information, as well as the software tools and methods for integrating this information with global expression profiling experiments. Global gene expression profiling data, both at the RNA level (e.g., oligonucleotide array data) and at the protein level (e.g., data derived from two-dimensional gel electrophoresis or protein microarrays), represent a vast resource for exploration, or “data mining,” to uncover new insight into the complex biological systems under study. Yet the process of extracting information from such sources may be daunting. A useful strategy for extracting meaningful information from a potentially overwhelming sea of expression data is to link the data to external sources of biological information, such as gene annotation or the biomedical literature. In particular, substantial information on biological pathways is now available and is useful for interpreting gene expression data. Moreover, numerous databases on biological pathways have recently become available that allow the results of global profiling experiments to be linked to these pathways, helping to explain the observed differential expression of genes and proteins. Pathways and other biological networks may be represented as graphs, with genes or proteins as the nodes and protein-protein or protein-DNA interactions as the edges. In functional pathway analysis, graph theoretic methods and graph visualization tools can be used to look for interesting patterns between pathway data and proteomic or genomic data; for example, a graph may display a large number of previously established connections among a set of genes or proteins found to be coregulated in a particular study.
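
The graph representation described above can be illustrated with a minimal sketch, given here in Python. It stores interactions as an adjacency map and reports which pairs in a coregulated gene set are joined by a previously established interaction. The gene names, the edge list, and the helper functions `build_graph` and `known_connections` are hypothetical and are not drawn from any pathway database; edges are treated as undirected for simplicity, although protein-DNA interactions are directional in practice.

```python
from itertools import combinations

# Hypothetical pathway edges for illustration only (not from any real database).
# Each entry is (node, node, interaction type).
PATHWAY_EDGES = [
    ("GENE_A", "GENE_B", "protein-protein"),
    ("GENE_B", "GENE_C", "protein-DNA"),
    ("GENE_C", "GENE_D", "protein-protein"),
    ("GENE_A", "GENE_E", "protein-protein"),
]


def build_graph(edges):
    """Build an undirected adjacency map: node -> {neighbor: interaction type}."""
    graph = {}
    for source, target, kind in edges:
        graph.setdefault(source, {})[target] = kind
        graph.setdefault(target, {})[source] = kind
    return graph


def known_connections(graph, gene_set):
    """Return the established interactions between members of gene_set."""
    found = []
    for a, b in combinations(sorted(gene_set), 2):
        kind = graph.get(a, {}).get(b)
        if kind is not None:
            found.append((a, b, kind))
    return found


graph = build_graph(PATHWAY_EDGES)

# Hypothetical set of genes found to be coregulated in an expression study.
# GENE_X is absent from the pathway data and so contributes no connections.
coregulated = {"GENE_A", "GENE_B", "GENE_C", "GENE_X"}

connections = known_connections(graph, coregulated)
for a, b, kind in connections:
    print(f"{a} -- {b} ({kind})")
```

In this toy case two of the six possible pairs are connected. Whether such a count is notable depends on how many connections a gene set of the same size would show by chance, a question this sketch does not address.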