ABSTRACT

Regulatory guidelines for drug development suggest strong control of the familywise error rate (FWER) when multiple hypotheses are simultaneously tested in confirmatory clinical trials (ICH, 1998; CHMP, 2002). That is, the probability of erroneously rejecting at least one true null hypothesis is controlled at a prespecified significance level α ∈ (0, 1) under any configuration of true and false null hypotheses. A variety of multiple test procedures exist that control the FWER at the designated level α, and the underlying theory is well developed (Dmitrienko et al., 2009; Bretz et al., 2010; Westfall et al., 2011). However, confirmatory studies are becoming increasingly complex and often involve multiple statistical hypotheses that reflect structured clinical study objectives. Typical examples include the simultaneous investigation of multiple doses or regimens of a new treatment, two or more clinical endpoints, several populations, noninferiority and superiority, or any combination thereof. Clinical teams are then faced with the difficult task of structuring these hypotheses to best reflect the clinical study’s objectives. This task comprises, but is not restricted to, identifying the study’s primary and secondary objectives, deciding whether a single hypothesis is of paramount importance or several are equally relevant, and determining the degree of control over incorrect decisions. In addition, pairs of primary and secondary objectives might be coupled and should thus be investigated hierarchically. For example, in a diabetes trial, a reduction in the patients’ body weight may only be of interest if a reduction in the glycated hemoglobin (HbA1c) level is achieved, but two different doses of a treatment are equally relevant contenders for a dosage recommendation of a specific drug.
In this case, the hypothesis involving body weight reduction at the low dose is a descendant of its parent primary hypothesis (HbA1c level reduction at the low dose).
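
To make the parent and descendant structure of the diabetes example concrete, the following is a minimal sketch and not the procedure developed in this chapter. It assumes a simple Bonferroni split of α across the two doses and tests each body weight hypothesis only if its parent HbA1c hypothesis has been rejected. The function name, the hypothesis labels, and the p-values are hypothetical and chosen purely for illustration.

```python
from typing import Dict


def hierarchical_bonferroni(p_values: Dict[str, float],
                            alpha: float = 0.05) -> Dict[str, bool]:
    """Illustrative sketch (an assumption, not the chapter's method).

    Each dose gets alpha / 2 (Bonferroni split across two equally
    relevant doses). Within a dose, the secondary hypothesis (body
    weight) is tested only if its parent primary hypothesis (HbA1c)
    was rejected at the same local level.
    """
    doses = ("low", "high")            # hypothetical dose labels
    local_alpha = alpha / len(doses)   # equal split across the doses
    rejected: Dict[str, bool] = {}
    for dose in doses:
        primary = f"hba1c_{dose}"      # parent hypothesis
        secondary = f"weight_{dose}"   # descendant hypothesis
        rejected[primary] = p_values[primary] <= local_alpha
        # The descendant can only be rejected if its parent was rejected.
        rejected[secondary] = (rejected[primary]
                               and p_values[secondary] <= local_alpha)
    return rejected


# Hypothetical p-values, for illustration only.
example_p = {
    "hba1c_low": 0.010,
    "weight_low": 0.020,
    "hba1c_high": 0.040,
    "weight_high": 0.001,
}
result = hierarchical_bonferroni(example_p, alpha=0.05)
print(result)
```

With these made-up p-values, the body weight hypothesis at the high dose is not rejected despite its small p-value, because its parent HbA1c hypothesis fails at the local level α/2 = 0.025. Since each dose forms a fixed-sequence chain tested at level α/2, the union bound keeps the FWER at or below α under this sketch.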