ABSTRACT

In this chapter, we switch from the estimation problem to the testing problem, and we explain some possible ways to handle the impact of high dimensionality in this context. More precisely, we focus on the problem of performing a large number of tests simultaneously. This issue is of major importance in practice: many scientific experiments seek to determine whether a given factor has an impact on various quantities of interest. For example, we can look for the possible side effects (headache, stomach pain, drowsiness, etc.) induced by a new drug. From a statistical perspective, this amounts to testing simultaneously, for each quantity of interest, the hypothesis “the factor has no impact on this quantity” against “the factor has an impact on this quantity.” As we have seen in Chapter 1, considering many different tests simultaneously induces a loss in our ability to discriminate between the two hypotheses. We present in this chapter the theoretical basis for reducing this deleterious effect as much as possible. We start by illustrating the issue on a simple example, and then we introduce the basics of False Discovery Rate control.