ABSTRACT

We have seen in the previous chapter that a major difficulty of the standard approach to GO overrepresentation analysis is that each term is analyzed in isolation. Because of the statistical dependencies between terms that are close to one another in the ontology graph, if one term is called significant, then one or more related terms are commonly called significant as well. A similar problem can affect terms that are distant from one another in the ontology but whose annotations are correlated. The parent-child and the topology algorithms were developed in an attempt to compensate for these effects by means of more or less local adjustments to the statistical tests performed for the GO terms. These procedures are able to reduce false-positive results on simulated data and tend to return smaller lists of terms on real datasets.