ABSTRACT

Nolen Joy Perualila, Ziv Shkedy, Sepp Hochreiter and Djork-Arne Clevert

15.1 Introduction
15.2 Information Content of Biclusters
15.2.1 Theoretical Background
15.2.2 Application to Drug Discovery Data Using the biclustRank R Package
15.3 Ranking of Biclusters Based on Their Chemical Structures
15.3.1 Incorporating Information about Chemical Structures Similarity
15.3.2 Similarity Scores Plot
15.3.2.1 Heatmap of Similarity Scores
15.3.3 Profiles Plot of Genes and Heatmap of Chemical Structures for a Given Bicluster
15.3.4 Loadings and Scores
15.4 Discussion

In Chapter 14, biclustering is presented as a gene module enrichment technique in which we search for a bicluster containing the primary genes of interest. Ideally, we would like to examine all biclusters discovered by an algorithm. In many cases, however, a bicluster solution reports a large number of biclusters. A procedure for prioritizing biclusters, irrespective of the biclustering algorithm, is therefore needed. The open question is how to determine which biclusters are most informative and how to rank them by importance. In many studies, biclusters are empirically evaluated based on different statistical measures (Koyutürk et al., 2004) or

Figure 15.1 Ranking of biclusters. Diagram labels: Data Matrix; Biclustering Solution (BC1, BC2, ..., BCK); Ranking based on information content; Ranking based on chemical structures.