ABSTRACT

The availability of the Gene Ontology (GO) and many cross-referencing databases seems to solve the problem of interpreting the results of a microarray experiment from a biological point of view. Most databases provide efficient search mechanisms that quickly return all annotation information associated with any specific gene or gene product of interest. However, most relevant databases are oriented towards manual, gene-by-gene querying. If the list of differentially regulated genes were processed manually, one would take each accession number corresponding to a regulated gene, search various public databases and compile a list of, for instance, the biological processes in which the gene is involved. The same type of analysis could be carried out for other functional categories, such as biochemical function, cellular role, etc. This task can be repeated for each gene in order to construct a master list of the biological processes in which the regulated genes are involved.

Processing this list can then reveal those biological processes that are common to several of the regulated genes. It is intuitive to expect that the biological processes that occur more frequently in this list would be more relevant to the condition studied. For instance, if 200 genes have been found to be differentially regulated and 160 of them are known to be involved in, let us say, mitosis, it is intuitive to conclude that mitosis is a biological process important in the given condition. As we shall see in the following example, this intuitive reasoning is incorrect, and a more careful analysis is needed to identify the truly relevant biological processes.
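The manual tallying procedure described above (look up each regulated gene, collect its biological processes, and count how often each process appears) can be sketched in a few lines of Python. This is only an illustration of the naive frequency count, not the more careful analysis the text calls for. The accession numbers, the annotation table ANNOTATIONS and the function name tally_processes are hypothetical and stand in for whatever a real database query would return.

```python
from collections import Counter

# Hypothetical annotation table: accession number -> biological processes.
# In practice this information would come from querying GO or a
# cross-referencing database; the entries below are invented.
ANNOTATIONS = {
    "ACC0001": ["mitosis", "cell cycle"],
    "ACC0002": ["mitosis"],
    "ACC0003": ["apoptosis"],
    "ACC0004": ["mitosis", "apoptosis"],
}


def tally_processes(regulated_genes, annotations):
    """Count how many regulated genes are annotated with each process."""
    counts = Counter()
    for gene in regulated_genes:
        # set() ensures a process listed twice for one gene is counted once;
        # genes with no annotation (e.g. ACC0005) contribute nothing.
        counts.update(set(annotations.get(gene, [])))
    return counts


# Hypothetical list of differentially regulated genes.
regulated = ["ACC0001", "ACC0002", "ACC0003", "ACC0004", "ACC0005"]
master_list = tally_processes(regulated, ANNOTATIONS)

# Print the processes from most to least frequent.
for process, n in master_list.most_common():
    print(f"{process}: {n} of {len(regulated)} regulated genes")
```

The sketch reports only raw frequencies among the regulated genes, which is exactly the quantity the intuitive reasoning relies on. On its own it says nothing about whether a frequent process is truly relevant.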