ABSTRACT

The first step of the engineering approach is to use the data produced by experiments to construct mathematical models for the systems of interest. With the advent of high-throughput technologies such as DNA microarrays, it is now possible to measure expression levels of thousands of mRNA targets simultaneously. Given this experimental data and an abstract class of potential models for genetic circuit configurations, how can one decide which circuit most likely generated the data? This chapter begins by briefly describing modern experimental methods in Section 2.1 and then presents a model for representing experimental data in Section 2.2. Next, Section 2.3 describes cluster analysis, which groups together genes that appear to be expressed at the same time and are potentially related in function. Clustering algorithms, however, do not indicate which genes activate or repress other genes. Therefore, Section 2.4 presents learning methods for Bayesian networks, which can potentially determine how genes interact. Bayesian methods, however, have trouble learning causal relationships, so Section 2.5 presents a method for learning a causal network. All of these methods must deal both with the limited size of the data sets and with the fact that this data tends to be very noisy. Gathering more experimental data, however, can be expensive and time-consuming. Therefore, Section 2.6 briefly describes how experiments can be designed to improve learning results.