ABSTRACT
Luc Bijnens, Willem Talloen, Bie Verbist, Hinrich W.H. Göhlmann
Janssen Pharmaceutica, Belgium
Adetayo Kasim
Durham University, United Kingdom
QSTAR Consortium
16.1 Introduction: From a Single Trial to a High-Dimensional Setting
16.2 The QSTAR Framework and Surrogacy
16.3 Data
16.3.1 The ROS1 Project
16.3.2 The EGFR Project
16.4 Graphical Interpretation (I): Association between a Gene and Bioactivity Accounting for the Effect of a Fingerprint Feature
16.5 Modeling Approach
16.5.1 The Joint Model
16.5.2 Inference
16.5.3 Graphical Interpretation (II): Adjusted Association and Conditional Independence
16.6 Analysis of the EGFR and the ROS1 Projects
16.6.1 Application to the EGFR Project
16.6.2 Application to the ROS1 Project
16.7 The R Package IntegratedJM
16.7.1 Identification of Biomarkers
16.7.2 Analysis of One Gene Using the gls Function
16.8 The IntegratedJM Shiny App
16.9 Concluding Remarks
In contrast with the previous chapters, which focused on data obtained from clinical trials, this chapter focuses on drug discovery experiments. Our aim is to find genetic biomarkers for phenotypic data for a set of compounds under development. The data for the analysis consist of (1) an m × n gene expression matrix (X) that contains the expression measurements of m genes for n compounds, (2) an n × 1 vector of phenotypic data (Y), and (3) an n × 1 vector describing the chemical structure (Z). Figure 16.1 illustrates the relationship between the three variables. Our goal is to model the relationship between gene expression and the phenotypic data, taking into account that the chemical structure of a compound may or may not influence both variables. This modeling approach is called QSTAR (Quantitative Structure-Transcription-Assay Relationship) and is discussed further in Section 16.2. The connection between the QSTAR framework and the surrogacy framework is illustrated in Section 16.4.
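To make the layout of the three data sources concrete, the following minimal Python sketch simulates an m × n expression matrix X, an n × 1 bioactivity vector Y, and an n × 1 binary structure feature Z, and then compares the raw gene-bioactivity correlation with the correlation obtained after removing the linear effect of Z from both variables. The simulated values, the binary coding of Z, and the helper names (`residuals`, `correlation`, `adjusted_association`) are assumptions made only for illustration. The residual-based adjustment stands in for the idea of accounting for the structure effect; it is not the joint model of Section 16.5 and not the IntegratedJM implementation.

```python
import random


def mean(v):
    return sum(v) / len(v)


def residuals(v, z):
    """Residuals of v after a least-squares fit on z (with intercept)."""
    v_bar, z_bar = mean(v), mean(z)
    szz = sum((zi - z_bar) ** 2 for zi in z)
    if szz == 0:  # z is constant: only the mean can be removed
        return [vi - v_bar for vi in v]
    slope = sum((zi - z_bar) * (vi - v_bar) for zi, vi in zip(z, v)) / szz
    return [(vi - v_bar) - slope * (zi - z_bar) for vi, zi in zip(v, z)]


def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    a_bar, b_bar = mean(a), mean(b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5


def adjusted_association(x_row, y, z):
    """Correlation between one gene and Y after removing the effect of Z."""
    return correlation(residuals(x_row, z), residuals(y, z))


random.seed(1)
m, n = 3, 40  # hypothetical sizes: m genes, n compounds

# Z: presence (1) or absence (0) of a structural feature, one value per compound
Z = [i % 2 for i in range(n)]
# Y: bioactivity, shifted by the structural feature
Y = [1.5 * z + random.gauss(0, 1) for z in Z]
# X: one row per gene, one column per compound
X = [
    [2.0 * z + random.gauss(0, 1) for z in Z],    # gene 0: driven by Z only
    [0.8 * y + random.gauss(0, 0.5) for y in Y],  # gene 1: tracks Y directly
    [random.gauss(0, 1) for _ in Z],              # gene 2: pure noise
]

for gene, row in enumerate(X):
    raw = correlation(row, Y)
    adj = adjusted_association(row, Y, Z)
    print(f"gene {gene}: raw r = {raw:.2f}, adjusted r = {adj:.2f}")
```

In this simulated setting, a gene whose link to Y arises only through Z would be expected to lose most of its association after adjustment, whereas a gene that tracks Y directly would retain it. This is the distinction the framework aims to draw between the two situations.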