ABSTRACT

Sample size determination is one of the major goals during the planning stage of a clinical trial. Consider a two-sample comparison study with normal outcomes, and let $\delta$ and $\sigma^2$ be the treatment difference and variance, respectively. Under the hypotheses

$H_0: \delta = 0$ versus $H_1: \delta > 0$, (12.1)

the sample size for a trial with type I error rate $\alpha$ and power $1-\beta$ is

$n = 4\sigma^2 (z_\alpha + z_\beta)^2 / \delta^2$, (12.2)

where $z_\alpha$ is the $100(1-\alpha)$th percentile of the standard normal distribution. However, missing data often arise in clinical trials [14], and they may have a substantial impact on the statistical power of the trial. In particular, the proportions of missing data could differ between treatment arms, which leads to an unbalanced design even if the original trial design is based on 1 : 1 allocation.

Generally, missing data can be classified into three missing mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR); for comprehensive coverage of missing data, see Little and Rubin [15]. In contrast to MAR and MNAR, MCAR is the strongest assumption but might not be realistic in practice. Under MCAR and MAR, valid statistical inference can be performed using the observed data; the type I error rate is typically maintained, although the trial may lose power. Under MNAR, the missing probability depends on the missing values themselves, which makes this mechanism more difficult to tackle. No matter which missing data mechanism is encountered in a trial, existing methods for sample size calculation can hardly deal with the missingness and thus cannot maintain the desired statistical accuracy and power.

In the design stage of a clinical trial, a common practice to account for missing data is to inflate the sample size by a factor of $1/(1-p_i)$, where $0 \le p_i < 1$ is the expected missing proportion [6]. Because $p_i$ is unknown in practice and such a simple inflation approach completely ignores the missing data patterns, the sample size could be over- or underestimated. For example, a clinical trial to study the effects of second-generation antipsychotic drugs had planned to enroll 254 patients by assuming a 75% follow-up rate. However, the actual follow-up rate turned out to be only 53% [10], and thus a simple inflation using 75% would lead to an inadequate sample size. As another example, Cobo et al. [3] reported a trial that added 102 new patients at an interim stage, but the difference in the missing probabilities of the genotypic and control arms was neglected, such that the sample size was unbalanced between the two arms.
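To make formula (12.2) and the simple inflation rule concrete, the following is a minimal Python sketch rather than the method proposed in this chapter. The function names `total_sample_size` and `inflate_for_missing` are hypothetical, and the design values (δ = 0.5, σ = 1, α = 0.025, power 0.9) are illustrative assumptions that do not come from any trial cited here. The sketch also assumes that n in (12.2) is the total across both arms under 1 : 1 allocation, and that the 254 patients in the antipsychotic example already include the inflation for a 75% follow-up rate.

```python
from math import ceil
from statistics import NormalDist


def total_sample_size(delta, sigma, alpha=0.025, power=0.9):
    """Evaluate (12.2): n = 4 * sigma^2 * (z_alpha + z_beta)^2 / delta^2.

    Assumption: n is read as the total over both arms (1:1 allocation)
    for the one-sided test H0: delta = 0 versus H1: delta > 0.
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # 100(1 - alpha)th percentile
    z_beta = std_normal.inv_cdf(power)       # 100(1 - beta)th percentile
    return 4 * sigma**2 * (z_alpha + z_beta) ** 2 / delta**2


def inflate_for_missing(n, missing_prop):
    """Simple inflation: divide n by (1 - p), with 0 <= p < 1."""
    if not 0 <= missing_prop < 1:
        raise ValueError("missing_prop must satisfy 0 <= p < 1")
    return n / (1 - missing_prop)


# Illustrative design values (assumptions, not taken from the text).
n_complete = total_sample_size(delta=0.5, sigma=1.0)
n_enrolled = inflate_for_missing(n_complete, missing_prop=0.25)
print(ceil(n_complete), ceil(n_enrolled))

# Antipsychotic example, under the stated reading of the 254 patients:
# completers the plan implicitly targeted at a 75% follow-up rate ...
target_completers = 254 * 0.75
# ... and the enrollment needed to reach them at the observed 53% rate.
needed_at_53 = inflate_for_missing(target_completers, missing_prop=1 - 0.53)
print(ceil(needed_at_53))
```

Under these assumptions, reaching the same number of completers at a 53% follow-up rate would take about 360 enrolled patients rather than 254, which illustrates how sensitive the simple inflation rule is to the assumed missing proportion. The rule also uses a single proportion, so it cannot reflect missing probabilities that differ between arms.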