ABSTRACT

As stated by Lewis and Machin (1993, p. 647), “In the clinical researcher’s perfect world, every subject entered into a randomized controlled clinical trial (RCT) would satisfy all entry criteria, would complete their allocated treatment as described in the protocol, and would contribute data records that were complete in all respects.” However, since clinical trials are dealing with human subjects, strict adherence to the trial protocol is impossible. Nonadherence to the trial protocol can be broadly classified into four categories related to four aspects of the trial: (1) inclusion/exclusion criteria, (2) study medication/treatment, (3) scheduled visit, and (4) efficacy measurement. These are explained further as follows:

1. Subjects are found to be ineligible (i.e., do not meet the inclusion/exclusion criteria) after randomization. One possible reason is misdiagnosis of the patient’s condition.

2. Subjects do not take study medication as scheduled or do not receive the specified amount of the treatment/intervention (noncompliance). At the extreme, subjects (1) do not take any study medication or receive any treatment/intervention, (2) receive the alternative treatment, or (3) take prohibited concomitant medications. Possible reasons for noncompliance could be toxicity, dropouts, or withdrawal.

3. Subjects miss the visits or do not make the visits as scheduled. Subjects may withdraw or drop out due to death, relocation, or loss to follow-up.

4. No efficacy measurements are taken, primarily due to missed visits.

In clinical trials, there are different forms of medical intervention and different study endpoints, even within the same therapeutic area, let alone across different therapeutic areas. The most common intervention is a drug product in pill form, used for many different diseases. Other interventions include (1) surgical procedures, which are particularly common in cancer research; (2) transfusion with biological products (e.g., blood and blood-related products); (3) medical devices, etc. The study endpoints include (1) cure of the disease or resolution of illness; (2) progression-free or long-term survival, which is particularly common in cancer research; (3) quality of life, etc. For a given RCT, depending upon the therapeutic areas (and/or different forms of medical intervention and different study endpoints), each of the four categories of nonadherence may or may not be applicable.

Data analysis in RCTs is complex, challenging, and controversial due to nonadherence to trial protocol, and it is evolving. The complexity arises because various aspects of nonadherence (see earlier) need to be addressed, and each therapeutic area (or each form of medical intervention and/or different study endpoints within a therapeutic area) may have its own unique issues. The concepts of (1) efficacy versus effectiveness (see Section 10.2.2) and (2) the explanatory approach versus the pragmatic approach (see Section 10.2.3) add another layer of complexity. Nonadherence to trial protocol leads to two major issues in data analysis: missing data and noncompliance. The controversy is due to the intention-to-treat (ITT) principle (see Section 10.2.1). There are two major aspects of data analysis: (1) subjects are analyzed as “randomized” (see Sections 10.3.2 and 10.6.1) or as “treated” (see Section 10.3.4), and (2) some randomized subjects may or may not be excluded from analysis.

The ITT principle (see Section 10.2.1) was controversial and prompted considerable debate for decades (see Section 10.4.1). Following the ITT principle, all randomized subjects are included in an analysis as “randomized” (see Section 10.3.2), while a per-protocol (PP) analysis excludes protocol violators (see Section 10.3.3). The controversy about whether to use an ITT or a PP approach in superiority trials largely subsided in the 1990s in favor of support for ITT (see Section 10.4.1). On the other hand, there is no clear “winner” for noninferiority (NI) trials (see Section 10.5). Current thinking from regulatory agencies is that both analyses are of equal importance and should lead to similar conclusions for a robust interpretation (see Section 10.5.1). However, using primarily ITT analysis in NI trials is gaining support in recent publications (e.g., Fleming et al. 2011; Schumi and Wittes 2011) (see Section 10.5.5). Implementation of an ITT analysis in RCTs is discussed in Section 10.6.

10.2.1 Intention-to-Treat Principle

The phrase “intention to treat” (ITT) was originated by Sir Austin Bradford Hill, and the term “intent-to-treat” is commonly used (Rothmann, Wiens, and Chan 2011). Bradford-Hill recommended that all participants be included “in the comparison and thus measure the intention to treat in a given way rather than the actual treatment” (1961, p. 258) as he noted that post-randomization exclusion of participants could affect the internal validity that randomization sought to achieve. As stated by Polit and Gillespie (2010, p. 357), “His advice was primarily aimed at researchers who deliberately removed subjects from the analysis.”

The Cochrane Collaboration (2002, p. 5) reiterated the ITT principle as follows:

The basic intention-to-treat principle is that participants in trials should be analyzed in the groups to which they were randomized, regardless of whether they received or adhered to the allocated intervention.

A glossary in ICH E9 (1998, p. 42) included “intention-to-treat principle” as follows:

The principle asserts that the effect of a treatment policy can be best assessed by evaluating on the basis of the intention to treat a subject (i.e., the planned treatment regimen) rather than the actual treatment given. It has the consequence that subjects allocated to a treatment group should be followed up, assessed, and analyzed as members of that group irrespective of their compliance with the planned course of treatment.

Gillings and Koch (1991, p. 411) elaborated the ITT principle as follows:

The fundamental idea behind ITT is that exclusion from the statistical analysis of some patients who were randomized to treatment may induce bias which favors one treatment group more than another. Bias may also occur if patients are not analyzed as though they belonged to the treatment group originally intended by the randomization procedure rather than the treatment actually received.

Deviations from the ITT principle, such as (1) excluding randomized subjects, as in PP analysis (see Section 10.3.3), and/or (2) transferring subjects from one group to another group, as in as-treated analysis (see Section 10.3.4), would destroy the comparability of the treatment groups that is achieved through randomization (Newell 1992).

10.2.2 Efficacy versus Effectiveness

Roland and Torgerson (1998, p. 285) define (1) efficacy as the “benefit a treatment produces under ideal conditions, often using carefully defined subjects in a research clinic,” and (2) effectiveness as “the benefit the treatment produces in routine clinical practice.” Hernan and Hernandez-Diaz (2012, p. 50) define efficacy as “how well a treatment works under perfect adherence and highly controlled conditions,” and effectiveness as “how well a treatment works in everyday practice.” Additionally, “Effectiveness takes into consideration how easy a drug is to use, and potential side effects, whereas efficacy measures only how well it produces the desired result” (Neal, 2009). Thaul (2012, p. 4) elaborates the differences between efficacy and effectiveness in the following:

Efficacy refers to whether a drug demonstrates a health benefit over a placebo or other intervention when tested in an ideal situation, such as a tightly controlled clinical trial. Effectiveness describes how the drug works in a real-world situation. Effectiveness is often lower than efficacy because of interactions with other medications or health conditions of the patient, sufficient dose or duration of use not prescribed by the physician or followed by the patient, or use for an off-label condition that had not been tested.

Roland and Torgerson (1998) introduce two different types of clinical trials: (1) explanatory trials, which measure efficacy, and (2) pragmatic trials, which measure effectiveness. Section 10.2.3 discusses the different aspects of these two types of trials.

10.2.3 Explanatory Approach versus Pragmatic Approach

Roland and Torgerson (1998, p. 285) describe differences between the explanatory and pragmatic approaches in different aspects of the trials, such as patient population, study endpoint, and analysis set, as follows:

• Patient population. An explanatory approach recruits as homogeneous a population as possible and aims primarily to further scientific knowledge. By contrast, the design of a pragmatic trial reflects variations between patients that occur in real clinical practice and aims to inform choices between treatments. To ensure generalizability, pragmatic trials should, so far as possible, represent the patients to whom the treatment will be applied. The need for purchasers and providers of health care to use evidence from trials in policy decisions has increased the focus on pragmatic trials.

• Study endpoint. In explanatory trials, intermediate outcomes are often used, which may relate to understanding the biological basis of the response to the treatment – for example, a reduction in blood pressure. In pragmatic trials, outcomes should represent the full range of health gains – for example, a reduction in stroke and improvement in quality of life.

• Analysis set. In a pragmatic trial, it is neither necessary nor always desirable for all subjects to complete the trial in the group to which they were allocated. However, patients are always analyzed in the group to which they were initially randomized (intention-to-treat analysis), even if they drop out of the study or change groups.

Schwartz and Lellouch (1967) describe the differences in study design between the two approaches and “consider a trial of anticancer treatments in which radiotherapy alone is to be compared with radiotherapy preceded by the administration of a drug that has no effect by itself but that may sensitize the patient to the effects of radiation.” The drug is assumed to be administered over a 30-day period. The “radiotherapy alone” group may then be handled in two different ways (see Figure 10.1), as described by Schwartz and Lellouch (1967, p. 638):

1. Radiotherapy may be preceded by a blank period of 30 days so that it is instituted at the same time in each group.

2. Radiotherapy may be instituted at once, thereby carrying it out at what is most probably the optimal time.

In design 1 (delayed radiotherapy in both groups), the two groups are alike from the radiotherapy point of view and differ solely in the presence or absence of the drug. Therefore, it provides an assessment of the sensitizing effect of the drug and gives valuable information at a biological level. This is the explanatory approach. It answers the research question of whether the drug has a sensitizing effect. However, it does not answer the practical question of whether the combined treatment is better than immediate radiotherapy.

In design 2 (immediate radiotherapy in one group), the two treatments are compared under the conditions in which they would be applied in practice. This is the pragmatic approach. It answers the practical question of whether the combined treatment is better than immediate radiotherapy. However, it will provide information on the effectiveness of the drug only when the combined treatment proves to be better than radiotherapy alone.

10.3.1 A Simplified Schema for a Randomized Control Trial

Referring to Figure 10.2, Newell (1992, pp. 838-839) described a simplified schema for a randomized control trial as follows:

For any RCT of a health care intervention (e.g., formal rehabilitation compared with none, day-case surgery compared with inpatient care, two different recruitment methods for mammography), the broad research outline is as shown in [Figure 10.2]... by the end of the trial, there were four groups of patients: (1) those allocated to A who did not complete A, (2) those allocated to A who did complete it, (3) those allocated to B who completed it, and (4) those allocated to B who did not complete it. ITT analysis (otherwise known as “pragmatic trial” or “[program] effectiveness analysis”) compares 1 + 2 with 3 + 4. Efficacy analysis (otherwise known as “explanatory trial” or “test of biological efficacy”) compares 2 with 3, ignoring 1 and 4. Treatment-received analysis (otherwise known as “as treated”) compares 1 + 3 with 2 + 4 when treatments are switched.

These three types of analyses are further elaborated in the next three subsections.
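The three comparisons in Newell’s schema can be sketched in code. The following Python sketch is illustrative only: the `Subject` record, the `newell_group` and `analysis_arms` functions, and the analysis labels are hypothetical names that do not come from Newell (1992), and the treatment-received branch assumes that every subject who did not complete the allocated treatment received the other treatment instead (Newell’s “when treatments are switched”).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    subject_id: int
    allocated: str   # arm assigned at randomization: "A" or "B"
    completed: bool  # True if the allocated treatment was completed


def newell_group(s: Subject) -> int:
    """Map a subject to one of the four groups in the schema (1-4)."""
    if s.allocated == "A":
        return 2 if s.completed else 1
    return 3 if s.completed else 4


def analysis_arms(subjects, analysis):
    """Return {"A": [...], "B": [...]} of subject ids for the chosen analysis."""
    other = {"A": "B", "B": "A"}
    arms = {"A": [], "B": []}
    for s in subjects:
        if analysis == "itt":
            # Analyze as randomized: groups 1 + 2 versus 3 + 4.
            arms[s.allocated].append(s.subject_id)
        elif analysis == "efficacy":
            # Completers only: group 2 versus 3, ignoring 1 and 4.
            if s.completed:
                arms[s.allocated].append(s.subject_id)
        elif analysis == "treatment_received":
            # Assumption: every non-completer switched to the other arm.
            arm = s.allocated if s.completed else other[s.allocated]
            arms[arm].append(s.subject_id)
        else:
            raise ValueError(f"unknown analysis: {analysis}")
    return arms


# One illustrative subject per group; the id equals the group number.
subjects = [
    Subject(1, "A", False),
    Subject(2, "A", True),
    Subject(3, "B", True),
    Subject(4, "B", False),
]
```

Calling `analysis_arms(subjects, "itt")` keeps all four subjects in their randomized arms, whereas the `"efficacy"` and `"treatment_received"` options drop or move subjects 1 and 4, which is exactly where the comparability created by randomization can be lost.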

10.3.2 Intention-to-Treat Analyses

An ITT analysis is also known as a program effectiveness analysis (see Sections 10.2.2 and 10.3.1). It includes all randomized subjects in the treatment groups to which they were randomized, regardless of whether they received or adhered to the assigned treatment. Such an analysis preserves comparable treatment groups due to randomization and prevents bias resulting from post-randomization exclusions. Newell (1992, p. 837) provides the rationale for the ITT analysis in a different way, as follows:

The purpose of randomization is to avoid selection bias and to generate groups which are comparable to each other. Any changes to these groups by removing some individuals’ records or transferring them to another group destroy that comparability.

Comparability of treatment groups is the foundation on which the statistical inference is built. Validity of the statistical inference will be compromised if such comparability is destroyed.

In theory, there is a consensus in the literature regarding the definition of an ITT analysis, as illustrated in the following:

• “In 1990, a work group for the Biopharmaceutical Section of the American Statistical Association (ASA) came to the conclusion that it is one which includes all randomized patients in the groups to which they were randomly assigned, regardless of their compliance with the entry criteria, regardless of the treatment they actually received, and regardless of subsequent withdrawal from treatment or deviation from the protocol (Fisher et al. 1990)” (Lewis and Machin 1993, p. 647).

• “In an intent-to-treat analysis (ITT), patients are analyzed according to the treatment to which they were assigned, regardless of whether they received the assigned treatment” (D’Agostino, Massaro, and Sullivan 2003, p. 182).

• “The ITT analysis includes all randomized patients in the groups to which they were randomly assigned, regardless of their compliance with the entry criteria, the treatment they actually received, and subsequent withdrawal from treatment or deviation from the protocol” (Le Henanff et al. 2006, p. 1148).

• “In an ITT analysis, subjects are analyzed according to their assigned treatment regardless of whether they actually complied with the treatment regimen” (Sheng and Kim 2006, p. 1148).

• “Intention-to-treat (ITT) is an approach to the analysis of randomized controlled trials (RCT) in which patients are analyzed as randomized regardless of the treatment actually received” (Gravel, Opartny, and Shapiro, 2007, p. 350).

• “In the ITT approach, all patients we intended to treat will be included into the analysis, whether they completed the trial following the protocol or not” (Gonzalez, Bolaños, and de Sereday 2009).

• “According to the principle, trial participants should be analyzed within the study group to which they were originally allocated irrespective of non-compliance or deviations from protocol” (Alshurafa et al. 2012).

• “An ITT analysis includes all participants according to the treatment to which they have been randomized, even if they do not receive the treatment. Protocol violators, patients who miss one or more visits, patients who drop out, and patients who were randomized into the wrong group are analyzed according to the planned treatment” (Treadwell et al. 2012, pp. 7-8).

In practice, however, a strict ITT analysis is often hard to achieve for two main reasons: missing outcomes for some participants and nonadherence to the trial protocol (Moher et al. 2010). Different definitions of modified ITT analysis have emerged to handle nonadherence to the trial protocol, typically by excluding some randomized patients, as given in the following:

• “…In a further 25 RCTs in 1997-1998, a modified intent-to-treat analysis was performed, which excluded participants who never received treatment or who were never evaluated while receiving treatment” (Hill, LaValley, and Felson 2002, p. 783).

• “Some definitions of ITT exclude patients who never received treatment” (D’Agostino, Massaro, and Sullivan 2003, p. 182).

• “A true or classic ITT is one that removes none of the subjects from the final analysis – with the exception of ineligible subjects removed post-randomization…” (Polit and Gillespie 2010, p. 357).

• “In this article, modified ITT refers to an approach in which all participants are included in the groups to which they were randomized, and the researchers make efforts to obtain outcome data for all participants, even if they did not complete the full intervention (Gravel et al. 2007; Polit and Gillespie 2009)” (Polit and Gillespie 2010, pp. 357-358).

To include all randomized patients in an ITT analysis, imputation methods are often used to deal with missing data. However, such methods require an untestable assumption of “missing at random” to some degree. The best way to deal with the problem is to have as little missing data as possible (Lachin 2000; NRC 2010). Although the ITT has been widely used as the primary analysis in superiority trials, it is often inadequately described and inadequately applied (Hollis and Campbell 1999). For example, studies that claimed use of ITT did not indicate how missing outcomes or deviations from protocols were handled.

10.3.3 Per-Protocol Analyses

In contrast to an ITT analysis, a per-protocol (PP) analysis excludes protocol violators and includes only participants who adhere to the protocol, as defined by some authors in the following:

• “One analysis which is often contrasted with the ITT analysis is the ‘per protocol’ analysis. Such an analysis includes only those patients who satisfy the entry criteria of the trial and who adhere to the protocol subsequently (here again there is ample room for different interpretations of what constitutes adherence to the protocol)” (Lewis and Machin 1993, p. 648).

• “The per-protocol analysis includes only patients who satisfied the entry criteria of the trial and who completed the treatment as defined in the protocol” (Le Henanff et al. 2006, p. 1184).

• “In a PP approach, however, only subjects who completely adhered to the treatment are included in the analysis” (Sheng and Kim 2006, p. 1184).

• “… is preferred to per-protocol (PP) analysis (i.e., using outcomes from only those participants who fully complied with the study protocol)” (Scott 2009, p. 329).

• “… the per-protocol (PP) population, which in this case is the set of people who have taken their assigned treatment and adhered to it” (Schumi and Wittes 2011).

On the other hand, other authors exclude major protocol violators from the PP analyses and allow minor protocol violators, as defined in the following:

• “The per-protocol (PP) analysis includes all patients who completed the full course of assigned treatment and who had no major protocol violations” (D’Agostino, Massaro, and Sullivan 2003, p. 182).

• “The PP population is defined as a subset of the ITT population who completed the study without any major protocol violations” (Sanchez and Chen 2006, p. 1171).

• “In general, the PP analysis, as described by ICH E-9 guidance, includes all subjects who were, in retrospect, eligible for enrollment in the study without major protocol violations, who received an acceptable amount of test treatment, and who had some minimal amount of follow-up” (Wiens and Zhao 2007, p. 287).

Excluding major protocol violators and allowing minor protocol violations opens the door for subjective judgments as to (1) what constitutes minor or major protocol violations, (2) what constitutes an acceptable amount of test treatment, and (3) what constitutes a minimal amount of follow-up.

Presumably, subjects who were switched to the other treatment arm (or who were wrongly randomized) are considered major protocol violators. Therefore, such subjects would be excluded from the PP analysis, although this is not explicitly stated in the aforementioned definitions. The exception is the definition given by Newell (1992) in Section 10.3.1, where the term “efficacy analysis” is used instead of “PP analysis.”

PP (or efficacy) analysis is also known as test of biological efficacy (see Sections 10.2.2 and 10.3.1) or on-treatment analysis (Heritier, Gebski, and Keech 2003; Piaggio et al. 2006; Kaul and Diamond 2007).

10.3.4 As-Treated Analyses

One key feature in ITT analysis is that subjects are analyzed according to their assigned treatment (i.e., as-randomized). This is in contrast with “as-treated” (AT) analysis where subjects are analyzed according to the treatment received, regardless of the regimen to which they were assigned, including subjects who do not complete the trial and those who switch from one treatment to another (Wertz 1995; Wiens and Zhao 2007). It is also known as a “treatment-received analysis” and a “garbage analysis” (Newell 1992; Wertz 1995).

10.3.5 Analysis Sets

From a practical point of view, a strict ITT analysis, including all randomized subjects, may not be warranted. In fact, ICH E9 (1998, p. 41) defines full analysis set as “the set of subjects that is as close as possible to the ideal implied by the intention-to-treat principle. It is derived from the set of all randomized subjects by minimal and justified elimination of subjects.” The document presents three circumstances that might lead to excluding randomized subjects from the full analysis set: (1) the failure to satisfy major entry criteria (eligibility violations), (2) the failure to take at least one dose of trial medication, and (3) the lack of any data post-randomization, and states that such exclusions should always be justified. Peto et al. (1976) allow some inappropriately randomized patients to be excluded (see Section 10.6).

ICH E9 (1998, p. 41) defines per-protocol set (sometimes described as the valid cases, the efficacy sample, or the evaluable subjects sample) as

the set of data generated by the subset of subjects who complied with the protocol sufficiently to ensure that these data would be likely to exhibit the effects of treatment according to the underlying scientific model. Compliance covers such considerations as exposure to treatment, availability of measurements, and absence of major protocol violations.
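The derivation of the two analysis sets can be sketched as a filter over the randomized subjects. The Python sketch below is a minimal illustration under stated assumptions, not a rule taken from ICH E9: the `Record` fields, the function names, and the `min_doses` threshold are hypothetical. It treats each of the three circumstances as an automatic exclusion (ICH E9 only says they might lead to exclusion, with justification), and it takes the per-protocol set as a subset of the full analysis set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    subject_id: int
    met_major_entry_criteria: bool
    doses_taken: int
    has_post_randomization_data: bool
    major_protocol_violation: bool


def fas_exclusion_reasons(r: Record) -> list:
    """List which of the three ICH E9 circumstances apply to a subject."""
    reasons = []
    if not r.met_major_entry_criteria:
        reasons.append("failed major entry criteria")
    if r.doses_taken < 1:
        reasons.append("took no dose of trial medication")
    if not r.has_post_randomization_data:
        reasons.append("no post-randomization data")
    return reasons


def full_analysis_set(records) -> list:
    """Subject ids kept after the (assumed automatic) exclusions."""
    return [r.subject_id for r in records if not fas_exclusion_reasons(r)]


def per_protocol_set(records, min_doses: int) -> list:
    """Subset of the full analysis set with adequate exposure and no major
    protocol violation; `min_doses` is a hypothetical, protocol-specific cutoff."""
    return [
        r.subject_id
        for r in records
        if not fas_exclusion_reasons(r)
        and r.doses_taken >= min_doses
        and not r.major_protocol_violation
    ]


# Illustrative records only; these are not data from any trial.
records = [
    Record(1, True, 10, True, False),   # fully compliant
    Record(2, False, 10, True, False),  # eligibility violation
    Record(3, True, 0, True, False),    # never dosed
    Record(4, True, 3, True, False),    # under-exposed
    Record(5, True, 10, False, False),  # no post-randomization data
    Record(6, True, 10, True, True),    # major protocol violation
]
```

Returning the reasons rather than a bare yes/no keeps each exclusion auditable, in line with the requirement that exclusions always be justified. The choice of `min_doses` is exactly the kind of subjective judgment noted in Section 10.3.3.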

10.4.1 Controversy of the ITT Principle

In the late 1980s, ITT analysis (see Section 10.3.2) was endorsed by regulatory agencies (U.S. FDA 1988; NCM 1989) as the primary analysis of RCT data. It is also recommended by the American Statistical Association work group (Fisher et al. 1990) and the Cochrane Collaboration (Moher, Schulz, and Altman 2001). Note that such a recommendation is in the context of placebo-controlled trials as opposed to active-control trials (more specifically, NI trials).

The goal of the ITT principle in RCTs is to preserve the prognostic balance between participants in treatment and control groups achieved through randomization and to thereby minimize selection bias and confounding (Alshurafa et al. 2012) (see also Sections 10.2.1 and 10.3.2). However, the ITT principle was controversial and prompted considerable debate for decades. See, for example, Lachin (2000) and the references therein. The controversy is summarized by Polit and Gillespie (2010, p. 357) as follows:

Sir Bradford-Hill’s recommendation was controversial and instigated considerable debate (Lachin 2000). Opponents advocated removing subjects who did not receive the treatment, arguing that such a per-protocol analysis would test the true efficacy of the intervention. Opponents maintained that it is not sensible to include in the intervention group people who did not actually receive the intervention. This position was most sharply expressed within the context of pharmaceutical trials, where estimates of effectiveness could be construed as the degree of beneficial effects among those who were compliant and able to tolerate the drug. Those advocating an ITT analysis, on the other hand, insisted that per-protocol analyses would likely lead to biased estimates of effectiveness because removal of noncompliant patients undermined the balance that randomization was presumed to have created (the methodological argument). Moreover, they argued that an ITT approach yields more realistic estimates of average treatment effects, inasmuch as patients in the real world drop out of treatment or fail to comply with a regimen (the clinical and policy argument).

One argument for ITT analysis in dealing with noncompliance in superiority trials is that noncompliance would lower the apparent impact of effective interventions and hence, would provide a conservative estimate of the treatment effect (Alshurafa et al. 2012). Although such an argument in superiority trials is legitimate, it is problematic in NI trials because a conservative estimate of the treatment effect may actually increase the likelihood of falsely concluding noninferiority, that is, inflation of the Type I error rate (see Section 10.5). Another argument for an ITT analysis, independent of study design (superiority or NI), is that the purpose of the analysis is to estimate the effects of allocating an intervention in practice, not the effects in the subgroup of participants who adhere to it (Cochrane Collaboration 2002). In other words, the interest is to assess effectiveness rather than efficacy (see Section 10.2.2).
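The opposite consequences of dilution in superiority and NI trials can be seen in a toy calculation. The Python sketch below rests on simplifying assumptions that are not taken from the cited references: the same fraction of each arm complies, noncompliers receive no treatment effect at all, and only expected values are considered (sampling variability is ignored). The function name, the effect sizes, and the NI margin are hypothetical.

```python
def itt_difference(effect_test: float, effect_control: float,
                   compliance: float) -> float:
    """Expected ITT difference (test minus control) in a toy dilution model.

    Assumptions: a fraction `compliance` of each arm receives the full effect
    of its assigned treatment, and noncompliers receive no effect.
    """
    if not 0.0 <= compliance <= 1.0:
        raise ValueError("compliance must be between 0 and 1")
    return compliance * (effect_test - effect_control)


# Superiority versus placebo: dilution shrinks a real benefit (conservative).
superiority_full = itt_difference(10.0, 0.0, compliance=1.0)
superiority_half = itt_difference(10.0, 0.0, compliance=0.5)

# NI versus active control, with a hypothetical margin of -3: the test
# treatment is truly worse by 4, but the diluted difference is only 2,
# so it no longer falls outside the margin (anticonservative).
NI_MARGIN = -3.0
ni_full = itt_difference(6.0, 10.0, compliance=1.0)
ni_half = itt_difference(6.0, 10.0, compliance=0.5)
looks_noninferior_full = ni_full > NI_MARGIN
looks_noninferior_half = ni_half > NI_MARGIN
```

In this model the same shrinkage toward zero makes a superiority claim harder to establish but an NI claim easier, which is the concern taken up in Section 10.5.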

10.4.2 Recommendations for ITT

As stated by Lewis and Machin (1993, p. 648), “... ITT is better regarded as a complete trial strategy for design, conduct, and analysis rather than as an approach to analysis alone.” In this regard, Hollis and Campbell (1999, p. 674) recommend the following:

• Design
  • Decide whether the aim is pragmatic or explanatory. For pragmatic trials, ITT is essential.
  • Justify in advance any inclusion criteria that, when violated, would merit exclusion from ITT analysis.

• Conduct
  • Minimize missing response on the primary outcome.
  • Follow up with subjects who withdraw from treatment.

• Analysis
  • Include all randomized subjects in the groups to which they were allocated.
  • Investigate the potential effect of missing response.

• Reporting
  • Specify that ITT analysis has been carried out, explicitly describing the handling of deviations from randomized allocation and missing response.
  • Report deviations from randomized allocation and missing response.
  • Discuss the potential effect of missing response.
  • Base conclusions on the results of ITT analysis.

See Schulz, Altman, and Moher (2010) for the CONSORT guidelines for reporting parallel-group randomized trials.

10.4.3 Discussion

There are two strong arguments in favor of ITT analysis over PP analysis of superiority trials: (1) the ITT analysis preserves randomization, while exclusion of noncompliant subjects in the PP analysis could introduce bias that randomization intends to avoid; and (2) the ITT analysis including noncompliant subjects is more reflective of “real-world” practice, while the PP analysis is not, as it excludes noncompliant subjects. With regard to the latter argument, the ITT analysis including noncompliant subjects will result in conservative estimates of efficacy, which is not the case in the estimation of effectiveness by definition (see Section 10.2.2).

A strict ITT analysis includes all randomized subjects with the “once randomized always analyzed” philosophy (Schulz and Grimes 2002) as “randomized,” while a strict PP analysis includes only subjects who strictly adhere to the study protocol. Therefore, these two analyses are at the two extremes in terms of the total sample size, which is not necessarily true in terms of bias. Assessing bias in estimating the efficacy due to noncompliance is complex and is beyond the scope of this book. More importantly, from regulatory and practical points of view, adjustment for noncompliance is not warranted, as the interest is in effectiveness rather than efficacy (see Section 10.2.2). Readers are referred to an issue of Statistics in Medicine for the analysis of compliance (volume 17, number 3, 1998).

Various modified ITT analyses have been used in practice (see Section 10.3.2) that allow some randomized subjects to be excluded. On the other hand, some PP analyses include minor protocol violators (see Section 10.3.3). For example, Gillings and Koch (1991) define (1) the ITT population as all randomized patients who were known to take at least one dose of treatment and who provided any follow-up data for one or more key efficacy variables, and (2) the efficacy analyzable population as a subset of the ITT population who adhered to the critical aspects of the protocol. Strict adherence to all protocol features is needed for a second version of the efficacy analyzable population. The authors recommend that patients be analyzed according to the treatments actually received when only a few patients are wrongly randomized and such administrative errors are not associated with the background characteristics of these patients or their prognosis.

10.5.1 Current Thinking from Regulatory Agencies

Although ITT analysis is widely accepted as the primary analysis in superiority trials (see Sections 10.1 and 10.4), there is concern that inclusion of noncompliant subjects in such an analysis in NI (or equivalence) trials would dilute the potential treatment difference, leading to an erroneous conclusion of NI, and therefore, it is anticonservative (see Section 10.5.2). On the other hand, inclusion of only compliant subjects (i.e., exclusion of noncompliant subjects) in a PP analysis would reflect the treatment differences that may exist; however, such exclusion could undermine the prognostic balance between the two treatment arms achieved through randomization, leading to potential biases (see Section 10.3.2). Thus, for NI trials, there is no single ideal analysis strategy in the face of substantial noncompliance or missing data, and both analysis by ITT and well-defined PP analyses would seem warranted (Pocock 2003).

The European Agency for the Evaluation of Medicinal Products, Committee for Proprietary Medicinal Products (EMEA/CPMP) guidance (2000) states that in an NI trial, the full analysis set and the PP analysis set have equal importance and their use should lead to similar conclusions for a robust interpretation. It should be noted that the need to exclude a substantial proportion of the ITT population from the PP analysis throws some doubt on the overall validity of the study (CPMP 1995). The Food and Drug Administration (FDA) draft guidance (2010) states the following, noting that “as-treated” analysis is not defined in the document (see Section 10.3.4 for this definition):

In [NI] trials, many kinds of problems fatal to a superiority trial, such as nonadherence, misclassification of the primary endpoint, or measurement problems more generally (i.e., “noise”), or many dropouts who must be assessed as part of the treated group, can bias toward no treatment difference (success) and undermine the validity of the trial, creating apparent [NI] where it did not really exist. Although an “as-treated” analysis is therefore often suggested as the primary analysis for NI studies, there are also significant concerns with the possibility of informative censoring in an as-treated analysis. It is therefore important to conduct both ITT and as-treated analyses in NI studies.

10.5.2 Anticonservatism of ITT Analysis in Noninferiority Trials

Although ITT analysis is widely accepted as the primary analysis in superiority trials, as discussed in Section 10.4, it was recognized in the 1990s that ITT analysis plays a different role in NI (or equivalence) trials because it is anticonservative in such cases. For example, Lewis and Machin (1993) stated the following:

… These are often called equivalence trials. … In such a trial, an ITT analysis generally increases the chance of erroneously concluding that no difference exists. When we are comparing an active agent with placebo, this increased risk is acceptable and is deliberately incurred. In these trials, ITT is conservative; we only declare a new agent effective when we have incontrovertible evidence that this is so, and the inevitable dilution of the treatment effect in an ITT analysis makes it harder to achieve this goal and affords extra statistical protection for the cautious. But when we are seeking equivalence, the bias is in an anticonservative direction.

And ICH E9 (1998) stated the following with some edits in the first sentence:

The full analysis set and the per-protocol set [see Section 10.3.5] play different roles in superiority trials and in equivalence or [NI] trials. In superiority trials, the full analysis set is used in the primary analysis (apart from exceptional circumstances) because it tends to avoid overoptimistic estimates of efficacy resulting from a per-protocol analysis. This is because the noncompliers included in the full analysis set will generally diminish the estimated treatment effect. However, in an equivalence or [NI] trial, use of the full analysis set is generally not conservative and its role should be considered very carefully.

The ITT analysis could be anticonservative in poorly conducted NI trials. For example, mixing up treatment assignments would bias toward similarity of the two treatments when the test treatment is not effective or is less effective than the active control (Ng 2001) (see Sections 1.5.3 in Chapter 1 and 2.5.1 in Chapter 2). Schumi and Wittes (2011) elaborated on such a hypothetical example in the following:

Consider a trial with a hopelessly flawed randomization, where instead of creating two distinct treatment groups (one set of subjects receiving the new treatment and the other the active comparator), the randomization scheme actually created two “blended” groups, each composed of half [the] subjects receiving the new treatment and half receiving the active comparator. If this trial were testing for superiority, the test would, with high probability, correctly find no difference between the groups. As [an NI] trial, however, such a flawed trial would be very likely to incorrectly demonstrate [NI].
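The arithmetic behind this "blended groups" example can be illustrated with a minimal sketch. It assumes a symmetric mix-up in which the same fraction of each randomized arm actually receives the other arm's treatment; the function name and the numerical values are hypothetical and are not taken from Schumi and Wittes (2011).

```python
def expected_itt_difference(mean_new, mean_control, misassigned_fraction):
    """Expected difference in arm means (new minus control) when a given
    fraction of each randomized arm actually receives the other treatment.

    Assumption for illustration only: the mixing is symmetric across arms
    and each subject's expected outcome depends only on the treatment
    actually received.
    """
    f = misassigned_fraction
    arm_new = (1 - f) * mean_new + f * mean_control
    arm_control = (1 - f) * mean_control + f * mean_new
    return arm_new - arm_control


# Hypothetical means: the new treatment is truly worse by 10 units.
no_mixing = expected_itt_difference(60.0, 70.0, 0.0)     # full true difference
fully_blended = expected_itt_difference(60.0, 70.0, 0.5)  # half of each arm swapped
```

With half of each arm swapped, the expected observed difference is zero whatever the true difference is, so an as-randomized comparison would tend to fall inside any NI margin even though the new treatment is inferior.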

Lewis and Machin (1993) gave an extreme hypothetical example where all patients were withdrawn from both randomized arms and put on the same standard therapy; an ITT analysis would conclude that no difference existed between the original treatments, regardless of their true relative efficacy. In another extreme hypothetical example given by Brittain and Lin (2005) where no subject complies with therapy, NI could be “demonstrated” between any two therapies with an ITT analysis. In fact, there is consensus regarding the role of ITT in NI trials, as seen in the following statements by many authors:

1. “In comparative studies [which seek to show one drug to be superior] the ITT analysis usually tends to avoid the optimistic estimate of efficacy which may result from a PP analysis, since the noncompliers included in an ITT analysis will generally diminish the overall treatment effect. However, in an equivalence trial, ITT no longer provides a conservative strategy and its role should be considered very carefully” (CPMP 1995, p. 1674).

2. “For equivalence trials, however, there is concern that an ITT analysis will move the estimated treatment difference towards zero since it will include patients who should not have been in the trial who will get no benefit, or patients who did not get the true treatment benefit because of protocol violation or failure to complete. PP analyses include only those who follow the protocol adequately. This would be expected to detect a clearer effect of treatment since uninformative ‘noise’ would be removed” (Ebbutt and Frith 1998, p. 1699).

3. “In [NI] trials, the ITT analysis tends to be ‘liberal.’ That is, by inclusion of those who do not complete the full course of the treatments, the ITT tends to bias towards making the two treatments (new treatment and active control) look similar. The PP removes these patients and is more likely to reflect differences between the two treatments” (D’Agostino, Massaro, and Sullivan 2003, p. 182).

4. “The dilemma for [NI] trials is that faced with non-negligible noncompliance analysis by [ITT] could artificially enhance the claim of [NI] by diluting some real treatment difference” (Pocock 2003, p. 489).

5. “[ITT] analyses will generally be biased toward finding no difference, which is usually the desired outcome in [NI] and equivalence trials and is favored by studies with many dropouts and missing data” (Gøtzsche 2006, p. 1173).

6. “However, in the [NI] setting, because the null and alternative hypotheses are reversed, a dilution of the treatment effect actually favors the alternative hypothesis, making it more likely that true inferiority is masked. An alternative approach is to use a [PP] population, defined as only participants who comply with the protocol” (Greene et al. 2008, p. 473).

7. “For [an NI] trial, the ITT analysis does not have a conservative effect. Dropouts and a poor conduct of the study might direct the results of the two arms toward each other. Another possibility is to consider the PP population, which consists of only the nonprotocol violators” (Lesaffre 2008, p. 154).

8. “In an [NI] trial, ITT analysis is thus more likely to narrow the difference between treatments and yield a noninferior result. Consequently, a PP analysis is needed to cross-validate the ITT analysis, while bearing in mind substantial variation between treatment groups in rates and reasons for dropout may also invalidate PP analyses” (Scott 2009, p. 329).

9. “Use of an ITT analysis in these trials could lead to a false conclusion of EQ-NI by diluting any real treatment differences” (Treadwell et al. 2012, p. 8).
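The argument shared by these statements can be written compactly. The notation below is an illustrative assumption and is not taken from the cited sources: let $\Delta = \mu_T - \mu_C$ be the true difference between the test treatment and the active control (larger is better), let $\delta > 0$ be the NI margin, and suppose, as a deliberately crude model, that a fraction $p$ of subjects in each arm contributes no treatment difference (e.g., because of noncompliance).

```latex
% Superiority: dilution moves the estimate toward the null hypothesis (conservative).
H_0^{\mathrm{sup}}\colon \Delta \le 0
\quad\text{versus}\quad
H_1^{\mathrm{sup}}\colon \Delta > 0

% Noninferiority: the hypotheses are reversed relative to "no difference".
H_0^{\mathrm{NI}}\colon \Delta \le -\delta
\quad\text{versus}\quad
H_1^{\mathrm{NI}}\colon \Delta > -\delta

% Crude dilution model for the ITT estimate.
E\bigl[\hat{\Delta}_{\mathrm{ITT}}\bigr] \approx (1 - p)\,\Delta

% If the test treatment is truly inferior (\Delta \le -\delta < 0), then
(1 - p)\,\Delta > -\delta
\iff
p > 1 - \frac{\delta}{|\Delta|}
```

Under this simplified model, a sufficiently large fraction of diluted subjects moves the expected ITT estimate from the null region into the NI alternative region, which is the sense in which the ITT analysis is called anticonservative in NI trials. In a superiority trial the same shrinkage moves a positive $\Delta$ toward zero, that is, toward the null hypothesis.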

10.5.3 Role of Per-Protocol Analyses in Noninferiority Trials

Since an ITT analysis could be anticonservative in poorly conducted NI trials, as discussed in Section 10.5.2, PP analysis provides an alternative that would reflect the treatment difference that may exist by excluding noncompliant subjects. See, for example, items 2, 3, 6, 7, and 8 in Section 10.5.2. For this reason, use of the PP population as the primary analysis in NI trials became prominent in the early 2000s (Garrett 2003).

Many authors (e.g., Ebbutt and Frith 1998; Gøtzsche 2006) recognize that excluding noncompliant subjects in the PP analysis could undermine the prognostic balance between the two treatment arms achieved through randomization, leading to potential biases. This concern is the key reason why ITT analysis rather than PP analysis is widely accepted as the primary analysis for superiority trials (see Section 10.4.1). It has also led the regulatory agencies and many authors (e.g., Lewis and Machin 1993; Pocock 2003; Pater 2004; Kaul and Diamond 2007; Lesaffre 2008; Scott 2009) to recommend that both ITT and PP analyses be performed in NI trials (see Section 10.5.1). For example, Lewis and Machin (1993, p. 649) stated the following:

However, careful consideration of the alternatives does not lead to totally abandoning the ITT analysis. Indeed, in equivalence trials, the overall ITT strategy of collecting all endpoints in all randomized patients is equally valuable, but the role of the ITT analysis itself differs. Essentially, the ITT analysis and one or more plausible [PP] analyses need to provide the same conclusion of no difference before a safe overall conclusion can be drawn.
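The requirement that both analyses reach the same conclusion can be sketched as a simple decision rule. The sketch below is illustrative only: it assumes a binary success endpoint, a Wald confidence interval for the difference in proportions, and a prespecified margin. The function names, the margin, and all counts are hypothetical.

```python
import math


def ni_lower_bound(success_new, n_new, success_ctrl, n_ctrl, z=1.96):
    """Lower limit of a two-sided Wald confidence interval for the
    difference in success proportions (new minus control)."""
    p_new = success_new / n_new
    p_ctrl = success_ctrl / n_ctrl
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ctrl * (1 - p_ctrl) / n_ctrl)
    return (p_new - p_ctrl) - z * se


def concludes_ni(lower_bound, margin):
    """NI is concluded when the lower confidence limit lies above -margin."""
    return lower_bound > -margin


def robust_ni(itt_counts, pp_counts, margin):
    """Return (both agree on NI, ITT concludes NI, PP concludes NI)."""
    itt_ok = concludes_ni(ni_lower_bound(*itt_counts), margin)
    pp_ok = concludes_ni(ni_lower_bound(*pp_counts), margin)
    return (itt_ok and pp_ok, itt_ok, pp_ok)


# Hypothetical counts: (successes new, n new, successes control, n control).
itt = (170, 200, 172, 200)   # all randomized subjects
pp = (126, 160, 136, 160)    # protocol-compliant subjects only
result = robust_ni(itt, pp, margin=0.10)
```

In this made-up example the ITT set supports NI while the PP set does not, so under the "same conclusion" requirement no safe overall conclusion of NI would be drawn.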

10.5.4 Examples of Comparisons between Intent-to-Treat and Per-Protocol Analyses in Noninferiority and Equivalence Trials

Ebbutt and Frith (1998) compared the ITT and PP results in 11 asthma equivalence trials where peak expiratory flow rate was used as the efficacy endpoint. The PP analysis consistently gave wider confidence intervals than the ITT analysis due entirely to a smaller number of subjects included in the PP analysis. There was no evidence that the ITT analyses were more conservative in their estimates of treatment difference. The authors suggested that the relative importance of the two analyses will depend on the definitions used in particular therapeutic areas and recommended seeking prior agreement with regulatory agencies on the role of ITT and PP populations.
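The effect of sample size alone on the confidence interval width can be illustrated with a minimal sketch. It assumes a normal-approximation interval for a difference in means with equal arm sizes and a common standard deviation; the standard deviation and the sample sizes are hypothetical and are not taken from Ebbutt and Frith (1998).

```python
import math


def ci_half_width(sd, n_per_arm, z=1.96):
    """Half-width of a normal-approximation confidence interval for a
    difference in means, assuming equal arm sizes and a common SD."""
    return z * sd * math.sqrt(2.0 / n_per_arm)


# Hypothetical values: same SD in both analysis sets, but the PP set
# retains only 75 of the 100 subjects per arm.
itt_half_width = ci_half_width(sd=40.0, n_per_arm=100)
pp_half_width = ci_half_width(sd=40.0, n_per_arm=75)
ratio = pp_half_width / itt_half_width  # equals sqrt(100 / 75)
```

With identical point estimates and variability, dropping a quarter of the subjects widens the interval by a factor of sqrt(100/75), which by itself makes it harder for the PP interval to stay within an equivalence or NI margin.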

Brittain and Lin (2005) compared the ITT and PP analyses from 20 antibiotic trials that were presented at the FDA Anti-Infective Drug Products Advisory Committee from October 1999 through January 2003. They saw no indication that the PP analysis tends to produce a larger absolute treatment effect than the ITT analysis; on the contrary, the data hints at the possibility that there may be a tendency for the ITT analysis to produce a larger observed treatment effect than the PP analysis. They speculated that in typical studies of antibiotic therapy, both analyses may often underestimate the “pure” treatment difference; that is, the effect in the set of patients who comply with therapy and have no other complicating factors. Note that “pure” treatment difference refers to efficacy as opposed to effectiveness (see Section 10.2.2).

In a systematic review, Wangge et al. (2010) included 227 articles on NI trials registered in PubMed on February 5, 2009. These articles referred to 232 trials. In 97 (41.8%) of the trials, both ITT and PP analyses were performed. They did not observe any evidence that the ITT analysis will lead to more NI conclusions than the PP analysis and concluded that both analyses are equally important, as each approach brings a different interpretation for the drug in daily practice.

10.5.5 Intention-to-Treat as the Primary Analysis in Noninferiority Trials

Due to concerns with ITT and PP analyses in NI trials (see Sections 10.5.2 and 10.5.3, respectively), current thinking from regulatory agencies does not put more emphasis on one over the other (see Section 10.5.1). However, there is a need for one primary analysis in NI trials from the industry point of view, as expressed by some authors in the following:

• “The need to demonstrate equivalence in both an ITT and a PP analysis in a regulatory trial increases the regulatory burden on drug developers. The relative importance of the two analyses will depend on the definitions used in particular therapeutic areas. Demonstrating equivalence in one population with strong support from the other would be preferred from the industry viewpoint” (Ebbutt and Frith 1998, p. 1691).

• “The suggestion to use both ITT and PP analyses to evaluate active-control studies places extra burdens on sponsors to meet regulatory objectives. Making one analysis primary is preferable for efficient drug development and consistent regulatory decision making” (Wiens and Zhao 2007, p. 291).

Wiens and Zhao (2007) advocated the use of ITT as the primary analysis in NI trials. They noted that the reasons for using ITT as the primary analysis in superiority trials, such as (1) preserving the value of randomization and (2) estimating real-world effectiveness (see Sections 10.2.2 and 10.4.1), are also applicable in NI trials. Other reasons are as follows:

• In order to support the constancy assumption, the ITT analysis should be preferred for the NI study because the active-control effect used in the NI margin determination is likely to be estimated based on an ITT analysis.

• Inference in switching from NI to superiority will be much simpler if the same analysis set is used.

• A major change in philosophy for NI testing with a very small NI margin is not rational (Hauck and Anderson 1999; Wiens and Zhao 2007).

On the other hand, the authors also discussed reasons for not using ITT analyses and considered three general areas in which such analyses are potentially biased: enrolling ineligible subjects, poor study conduct, and missing data. They elaborated on how to address/mitigate these potential biases. Fleming (2008, p. 328) favored the ITT analysis over the PP analysis even in NI trials, and stated the following:

Even in NI trials, ITT analyses have an advantage over [PP] analyses that are based on excluding from analysis the outcome information generated in patients with irregularities, since [PP] analyses fail to maintain the integrity of randomization.

He recognized that nonadherence, withdrawals from therapy, missing data, violations in entry criteria, or other deviations in the protocol often tend to reduce sensitivity to differences between regimens in an ITT analysis. Such protocol deviations are of concern, especially in NI trials, and therefore, it is important to design and conduct trials in a manner that reduces the occurrence of protocol deviations (Fleming 2008). The need to minimize protocol deviations is also expressed by Pocock (2003).

More recent articles promote ITT to the primary analysis in NI trials while demoting PP to a supportive or secondary analysis. For example, Fleming et al. (2011, p. 436) stated the following:

Only “as randomized” analyses, where all randomized patients are followed to their outcome or to the end of study (i.e., to the planned duration of maximum follow-up or the analysis cutoff date), preserve the integrity of the randomization and, due to their unconditional nature, address the questions of most important scientific relevance. Therefore, the preferred approach to enhancing the integrity and interpretability of the [NI] trial should be to establish performance standards for measures of quality of trial conduct (e.g., targets for enrollment and eligibility rates, event rate, adherence and retention rates, cross-in rates, and currentness of data capture) when designing the trial, and then to provide careful oversight during the trial to ensure these standards are met, with the “as randomized” analysis being primary and with analyses such as [PP] or analyses based on more sophisticated statistical models accounting for irregularities being supportive.

Another viewpoint in support of the ITT analysis is given by Schumi and Wittes (2011):

Appeal to the dangers of sloppiness is not a reason for using the PP population, but rather is a reason for ensuring that a trial is well designed and carefully monitored, with the primary analysis performed on an ITT population.

10.6.1 Introduction

Since a pure ITT analysis is practically impossible, a modified ITT analysis, although not consistently defined in the literature, is often used in the analysis of RCTs. In general, there are three aspects in the analysis of an RCT:

1. Whether the participants are grouped according to the treatment to which they were randomized (as randomized) or to the treatment that they actually received (as treated)

2. Whether or not to exclude protocol violators

3. How to deal with missing data
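The three aspects listed above can be made concrete with a small sketch that derives candidate analysis sets from subject-level records. The record layout, field names, and the simplified set definitions are assumptions made for illustration; they are not the formal definitions given in Section 10.3, and a real trial would follow the definitions prespecified in its protocol.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Subject:
    subject_id: str
    randomized_arm: str            # arm assigned by randomization
    received_arm: Optional[str]    # arm actually received; None if untreated
    protocol_violator: bool        # flagged under prespecified criteria
    outcome: Optional[float]       # None when the outcome is missing


def as_randomized(subjects: List[Subject]) -> Dict[str, str]:
    """Aspect 1 (as randomized): every subject, grouped by assigned arm."""
    return {s.subject_id: s.randomized_arm for s in subjects}


def as_treated(subjects: List[Subject]) -> Dict[str, str]:
    """Aspect 1 (as treated): treated subjects, grouped by arm received."""
    return {s.subject_id: s.received_arm for s in subjects
            if s.received_arm is not None}


def per_protocol(subjects: List[Subject]) -> Dict[str, str]:
    """Aspect 2: drop protocol violators and subjects who did not
    receive the arm to which they were randomized."""
    return {s.subject_id: s.randomized_arm for s in subjects
            if not s.protocol_violator and s.received_arm == s.randomized_arm}


def missing_outcomes(subjects: List[Subject]) -> List[str]:
    """Aspect 3: subjects whose outcome must be handled as missing."""
    return [s.subject_id for s in subjects if s.outcome is None]


# Hypothetical records.
subjects = [
    Subject("S1", "new", "new", False, 1.0),
    Subject("S2", "new", "control", True, 0.5),
    Subject("S3", "control", "control", False, None),
    Subject("S4", "control", None, True, None),
]
```

Here the as-randomized grouping keeps all four subjects, the as-treated grouping moves S2 to the control arm and drops the untreated S4, and the per-protocol set keeps only S1 and S3, with S3 still requiring a decision about its missing outcome.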

The ITT principle defined in Section 10.2.1 should be followed in the analysis of RCTs. Any deviation from this principle should be justified in terms of bias in the treatment comparisons and other considerations. There is no one-size-fits-all solution for a given deviation from the ITT principle. For example, subjects who were, in retrospect, ineligible for the study may or may not be excluded from the analysis, depending upon the situation (see Section 10.6.2). Sections 10.6.3 and 10.6.4 present other examples where subjects may or may not be excluded from the analysis. Sections 10.6.5 and 10.6.6 discuss missing data and noncompliance with study medication, respectively.

10.6.2 Ineligible Subjects (Inclusion/Exclusion Criteria)

In general, a study population enrolled in an RCT is defined by the inclusion/exclusion criteria. An explanatory approach recruits as homogeneous a population as possible (see Section 10.2.3). This will reduce the “noise” and thus increase the likelihood of a successful trial. Subjects enrolled in an RCT are a “convenience” sample rather than a random sample from the study population. However, to make a statistical inference regarding the study population one has to “pretend” that the enrolled subjects are a random sample. On the other hand, the study population should not be too restricted so that the trial results could be extrapolated to a more general patient population. It should be noted that such an extrapolation has to rely on subjective clinical judgments rather than on statistical inferences.

The principle for excluding subjects from the analysis without biasing the results is given by Gillings and Koch (1991) in the following:

In general, there will be no bias for treatment comparisons if subjects are excluded based on information that could have been known prior to randomization and such exclusion is not, in any way, dependent on the treatment. It is, of course, important that there was no relationship between treatment to which the subject was randomized and the manner in which the eligibility violation was detected. For example, some patients on a new treatment under test may be reviewed more rigorously because of the scrutiny given to a new therapy. If such enhanced scrutiny led to the identification of clerical errors or other facts that in turn led to the patient violating protocol criteria, then the withdrawal of such a patient would be treatment related because only patients on test treatment received the enhanced scrutiny.

Subjects who do not meet the inclusion/exclusion criteria could be inappropriately randomized due to human error. In principle, excluding such ineligible subjects from the analysis will not lead to a biased estimate of a treatment difference. In fact, Peto et al. (1976) allow for some inappropriately randomized patients to be excluded. CPMP (1995, p. 1673) states that “Any objective entry criteria measured before randomization that will be used to exclude patients from analysis should be prespecified and justified - for example, those relating to the presence or absence of the disease under investigation.” An independent adjudication committee blinded to treatment and outcome must systematically review each subject, and the decision whether to remove an ineligible subject should be based solely on information that reflects the subject’s status before randomization (Fergusson et al. 2002). The ICH E9 guidance (1998, pp. 28-29) lists four requirements for excluding subjects from the ITT analysis:

1. The entry criterion was measured prior to randomization.

2. The detection of the relevant eligibility violations can be made completely objectively.

3. All subjects receive equal scrutiny for eligibility violations. (This may be difficult to ensure in an open-label study, or even in a double-blind study, if the data is unblinded prior to this scrutiny, emphasizing the importance of the blind review.)

4. All detected violations of the particular entry criterion are excluded.

The protocol should state “the intention to remove entry criteria violators and to identify the relevant criteria so that no accusations of selective analysis can later arise” (Lewis and Machin 1993). Wiens and Zhao (2007) added a fifth requirement:

5. That the criteria for excluding a subject from the ITT analysis are developed a priori to prevent post hoc subset analyses from being presented as confirmatory.

To implement this requirement, they suggested “to form an independent subject eligibility committee composed of people not involved in the conduct of the study” (Wiens and Zhao 2007).
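The five requirements above can be treated as a checklist applied to each entry criterion before any exclusion is made. The sketch below is only one hypothetical way of recording such a check; the class and field names are assumptions, and the judgments behind each flag (e.g., whether scrutiny was truly equal across arms) remain matters for blinded review rather than code.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class ExclusionCheck:
    """One flag per requirement for excluding entry-criterion violators."""
    measured_before_randomization: bool      # requirement 1
    detected_objectively: bool               # requirement 2
    equal_scrutiny_all_subjects: bool        # requirement 3
    all_detected_violations_excluded: bool   # requirement 4
    criteria_prespecified: bool              # requirement 5 (Wiens and Zhao 2007)


def unmet_requirements(check: ExclusionCheck) -> List[str]:
    """Names of the requirements that are not satisfied."""
    return [name for name, satisfied in asdict(check).items() if not satisfied]


def exclusion_defensible(check: ExclusionCheck) -> bool:
    """True only when every requirement is satisfied."""
    return not unmet_requirements(check)


# Hypothetical case: an open-label study where one arm was reviewed more closely.
open_label_case = ExclusionCheck(True, True, False, True, True)
```

In the hypothetical open-label case, `unmet_requirements` reports the equal-scrutiny requirement, which is the situation Gillings and Koch (1991) warn can make an exclusion treatment related.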

There are occasions when the exclusion of entry criteria violators is important. Such a scenario is discussed by Lewis and Machin (1993, p. 649) in the following:

Suppose that drug X has been shown to be effective in mild and moderate disease, but there remains a question as to whether it is effective in severe cases, and so a trial in severe patients is started. Suppose also that some patients with moderate disease are inadvertently entered into this trial. Failure to exclude these patients from the analysis would bias the results in [favor] of drug X, or at any rate would lead to the answering of the wrong question.

On the other hand, there are exceptions to the exclusions of entry criteria violators. One such exception is given by Gillings and Koch (1991, pp. 415-416) as follows:

For example, subjects with a prior history of cardiovascular problems or women not using effective birth control are excluded from a particular trial for safety reasons. … In general, patients who should have been excluded for safety reasons but who do participate in the trial may not be appropriate candidates for exclusion from analysis because the original safety reason for exclusion, once violated, can be irrelevant from an efficacy perspective.

Newell (1992) recommends that randomization be done as late as possible when diagnostic and consent procedures have all been completed (see Figure 10.2 in Section 10.3.1). This helps reduce entry criteria violators to a reasonable minimum. However, it is not always possible to do so (see Example 4 in Section 10.6.4).

10.6.3 Subjects Who Do Not Receive the Randomized Treatment

In an RCT, subjects may or may not receive the randomized treatment, or may be switched to the other treatment for various reasons. As stated by Gillings and Koch (1991, p. 413), “... analyzing subjects as though they were given a treatment that they did not actually receive is typically regarded as inappropriate from the perspective of clinical medicine.” An alternative approach is (1) to exclude such subjects, as in the PP analysis (see Section 10.3.3), or (2) to include such subjects according to the treatment actually received, as in the AT analysis (see Section 10.3.4). Depending upon the reasons for not receiving the randomized treatment for a given study design, the PP or AT analysis may or may not introduce bias in the treatment comparison. Some examples are discussed in the following paragraphs.

Example 1a: Randomized treatment is not needed due to premature randomization

This example is discussed by Fergusson et al. (2002, p. 654) as follows:

In one trial of leucodepletion of red blood cells, patients were randomized before an operation rather than when a unit of red blood cells was requested by the surgical team (Houbiers et al. 1994). The point of randomization was premature, and 36% of the patients randomized to the study did not need a transfusion. … Excluding all randomized patients who did not receive a unit of red blood cells will not bias the analysis, as long as allocation to treatment or control arm could not influence the likelihood that patients receive a transfusion.

Excluding all such patients actually enhances the precision of the estimate and provides a meaningful estimate of relative risk reduction for the clinician (Fergusson et al. 2002). Fergusson et al. (2002, p. 654) recommend that “investigators should report an analysis of all randomized patients, as well as baseline characteristics for all patients excluded from the analysis.”

Example 1b: Randomized treatment is not needed in the treatment arm only

This example is discussed by Fergusson et al. (2002, p. 654) as follows:

In studies in which only patients allocated to one of two arms will receive the target intervention, excluding such patients will lead to biased results. For example, in a clinical trial of epidural anesthesia in childbirth, some women randomized to the epidural treatment arm did not need an epidural because their pain levels did not rise above their personal thresholds [Loughnan et al. 2000]. Investigators should not exclude these patients from the analysis, as they cannot identify similar patients in the control arm.

Example 2: Hypothetical example comparing medical versus surgical treatment as described by Newell (1992)

“Suppose, for dramatic simplicity, that patients are randomly allocated to medical or surgical treatment. Those allocated to medical treatment are given medication immediately, while those allocated to surgery may require preparation, possibly waiting a few days or weeks for an available surgical [theater] time-slot. If a patient should happen to die before reaching the operating theater, a surgeon might be inclined to say ‘that death should not count against the surgical option - I didn’t get a chance to put my knife into the patient.’ The physician would rightly claim that if the surgeon could discount these - obviously the sickest - patients, the comparison would not be fair. In fact, the surgical [program] includes some (inevitable) delays, and all mortality occurring after the decision to perform surgery must correctly be assigned as part of the outcome of that [program].”

Example 3: Subjects do not receive the randomized treatment due to administrative errors

Gillings and Koch (1991, p. 413) recommended that subjects be analyzed according to the treatments actually received “under the circumstance that only a few patients were wrongly randomized and that such administrative errors were not associated with the background characteristics of these patients or their prognosis.” They noted that in a large multicenter trial, there is likely at least one randomization error no matter how carefully the study procedures are implemented.

10.6.4 Subjects Who Do Not Have the Condition of Interest or the Underlying Disease

Subjects who were randomized might not have the condition of interest or the underlying disease either (1) because they were randomized before eligibility for inclusion could be confirmed (Example 4) or (2) due to poor or excessively broad eligibility criteria (Example 5). Subjects who do not have the condition of interest or the underlying disease should not be excluded from the ITT analysis using the pragmatic approach for effectiveness; however, such subjects should be excluded from the analysis using the explanatory approach for efficacy (see Sections 10.2.2 and 10.2.3). Such an analysis may be called a PP analysis in Example 4. However, it is not clear what such an analysis should be called in Example 5 because the excluded subjects actually meet the eligibility criteria.

Example 4: Subjects randomized before eligibility for inclusion can be confirmed

If investigators expect delays in obtaining clinical or laboratory information on subjects’ eligibility, they should ideally postpone randomization until this information is available. However, even with sound methods and procedures and the best of intentions, instances will occur when subjects must be randomized before all the data needed to confirm eligibility is available.

In a study of an anti-influenza drug, oseltamivir carboxylate (Treanor et al. 2000), all consenting subjects who presented to a doctor within 48 hours of development of influenza-like symptoms were enrolled and randomized into the trial. The study protocol stipulated that only subjects who later gave positive results on culture or serological tests for influenza infection would be included in the analysis. Of the 629 subjects who were randomized, 255 (40%) were later found to not have influenza. The study reported that in the 374 patients who were infected, the study drug reduced the duration of illness by 30% (p < 0.001). However, analysis of all 629 randomized patients shows a less dramatic but still significant effect of the study drug, with a reduced duration of 22% (p = 0.004).

Fergusson et al. (2002, p. 653) state the following:

This clinical scenario mirrors real-life clinical situations where doctors need to treat patients before all information is available. The major issue in the interpretation of results becomes one of effectiveness versus efficacy [see Section 10.2.2] or explanatory approach versus pragmatic approach [see Section 10.2.3]. One would want to be sure that the benefit of the study drug to patients with the underlying condition outweighs the harm to patients exposed to the drug without possibility of benefit. Therefore, the primary presentation of the results should include all patients randomized into the study.

Example 5: Poor or excessively broad eligibility criteria

This example is discussed by Fergusson et al. (2002, p. 653) as follows:

Poorly defined or excessively broad eligibility criteria can lead to the inclusion of patients who do not have the condition of interest and are, therefore, unlikely to benefit from treatment. For example, studies of severe infections resulting in sepsis syndrome are often beset by difficulties in defining the condition of interest and the eligibility criteria [Bone et al. 1992; Cohen et al. 2001]. The diversity of clinical presentations often results in the enrollment of patients who meet eligibility criteria and receive treatment but are unlikely to benefit. … A large [RCT] of a drug that modulates immune responses in severe sepsis enrolled a very diverse study population because of broad eligibility criteria [Fisher et al. 1994]. A high proportion (175/893 or 20%) of enrolled patients that met the criteria did not have a confirmed infection, resulting in a study that yielded a less-than-optimal test of the researchers’ hypothesis. … Under such circumstances, the primary analysis should include all randomized patients. A secondary analysis that includes only patients who had the condition of interest and that is based on data collected before randomization can also be informative and unbiased.

10.6.5 Missing Data

Since missing data is unavoidable in clinical trials with human subjects, to include all randomized subjects in the ITT analysis, one has to resort to data imputation. However, such methods require an untestable assumption of missing at random to some degree. The best way to deal with the problem is to have as little missing data as possible (Lachin 2000; NRC 2010) (see Section 10.3.2). In this regard, there is a need for a cultural shift, focusing on strategies to prevent missing data during the conduct and management of clinical trials, rather than relying on imperfect analytic methods (O’Neill and Temple 2012; Dziura et al. 2013). How to deal with missing data in the analysis is beyond the scope of this book. The reader is referred to Rothmann, Wiens, and Chan (2012).

At the request of the U.S. Food and Drug Administration (FDA), the Panel on the Handling of Missing Data in Clinical Trials was created by the National Research Council’s Committee on National Statistics. The panel’s work focused primarily on phase 3 confirmatory clinical trials that are the basis for the approval of drugs and devices. The panel concluded that a more principled approach to design and analysis in the presence of missing data is both needed and possible. Such an approach needs to focus on two critical elements: (1) careful design and conduct to limit the amount and impact of missing data, and (2) analysis that makes full use of information on all randomized participants and is based on careful attention to assumptions about the nature of the missing data underlying estimates of treatment effects. The panel presented 18 recommendations in the prevention and treatment of missing data in clinical trials (NRC 2010) (see Appendix 10.A).

Li et al. (2014) performed a systematic review on the prevention and handling of missing data for patient-centered outcomes research (PCOR). PCOR focuses on comparative effectiveness research that “helps people and their caregivers communicate and make informed health-care decisions, allowing their voices to be heard in assessing the value of health-care options” (PCOR, 2014). Note that PCOR studies could be RCTs or observational studies, such as cohort, case-control, and cross-sectional studies.

The authors identified 30 guidance documents that include at least one formal recommendation about how missing data should be prevented or handled. These 30 documents were published between 1996 and 2011 (5 were in draft form); more than half were published after 2008, and 12 were written for RCTs. Almost one-third (9 of 30, 30%) were prepared by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR), followed by 5 of 30 (17%) prepared by the FDA and 4 of 30 (13%) by the Expert Working Group (Efficacy) of the International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH).

The authors extracted 39 recommendations on the prevention and handling of missing data. The two-round consensus process and discussion yielded 10 mandatory standards: 3 on study design, 2 on conduct, 3 on analysis, and 2 on reporting (see Appendix 10.B). The four domains (i.e., study design, conduct, analysis, and reporting) are in line with the recommendation by Hollis and Campbell (1999) (see Section 10.4.2).

10.6.6 Noncompliance with Study Medication

Showing benefit in an explanatory trial using design (1) discussed in Section 10.2.3 does not necessarily lead to clinical benefits in practice. In principle, from regulatory and practical points of view, a medical intervention should be shown to provide clinical benefits in a real-world setting, that is, to show effectiveness rather than efficacy. See Section 10.2.2 for the definitions of effectiveness and efficacy. Therefore, noncompliant subjects should be included in the ITT analysis, and no adjustment for noncompliance is necessary. However, it should be noted that excessive noncompliance could render the results uninterpretable. See Section 10.4.3 for further discussion on noncompliance.

How much missing data is too much? Schulz and Grimes (2002) suggest that losses to follow-up of less than 5% usually have little impact, whereas losses greater than 20% raise serious issues about study validity. In-between levels lead to problems somewhere in the middle. To support this, Kristman, Manno, and Cote (2004) demonstrated through simulation that substantial bias in the estimation of odds ratios under missing not at random (MNAR) conditions may arise in cohort studies with loss to follow-up of 20%. However, the 5-20 rule of thumb has no statistical justification and oversimplifies the problem, as the bias resulting from missing data also depends on the missing data mechanism and the analytic method (Dziura et al. 2013).
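To make the last point concrete, the following minimal Python sketch computes the odds ratio that a completers-only analysis would target when loss to follow-up depends on the outcome. It is not taken from any of the cited papers: the event probabilities, the loss probabilities, and all function names are hypothetical and chosen only for illustration, and the calculation uses expected proportions rather than simulated data.

```python
def rule_of_thumb(loss_rate):
    """Label a loss-to-follow-up proportion using the 5-20 heuristic."""
    if loss_rate < 0.05:
        return "little impact expected"
    if loss_rate > 0.20:
        return "serious threat to validity"
    return "intermediate concern"


def odds(p):
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1.0 - p)


def overall_loss(p_event, loss_if_event, loss_if_no_event):
    """Overall proportion lost in a group when loss depends on the outcome."""
    return p_event * loss_if_event + (1.0 - p_event) * loss_if_no_event


def completer_event_prob(p_event, loss_if_event, loss_if_no_event):
    """Event probability among subjects who remain under follow-up."""
    kept_events = p_event * (1.0 - loss_if_event)
    kept_non_events = (1.0 - p_event) * (1.0 - loss_if_no_event)
    return kept_events / (kept_events + kept_non_events)


def completer_odds_ratio(p_exposed, p_unexposed, loss_exposed, loss_unexposed):
    """Odds ratio seen by a completers-only analysis.

    loss_exposed and loss_unexposed are (loss_if_event, loss_if_no_event)
    pairs, all hypothetical inputs.
    """
    q1 = completer_event_prob(p_exposed, *loss_exposed)
    q0 = completer_event_prob(p_unexposed, *loss_unexposed)
    return odds(q1) / odds(q0)


if __name__ == "__main__":
    p1, p0 = 0.30, 0.20                      # hypothetical true event probabilities
    true_or = odds(p1) / odds(p0)
    # Loss depends on the outcome in the exposed group only (an MNAR pattern).
    mnar = completer_odds_ratio(p1, p0, (0.40, 0.10), (0.20, 0.20))
    # Loss of 20% in both groups, unrelated to the outcome.
    unrelated = completer_odds_ratio(p1, p0, (0.20, 0.20), (0.20, 0.20))
    print("loss in exposed group:", overall_loss(p1, 0.40, 0.10),
          "->", rule_of_thumb(overall_loss(p1, 0.40, 0.10)))
    print("true OR:", true_or)
    print("completers-only OR, outcome-dependent loss:", mnar)
    print("completers-only OR, outcome-independent loss:", unrelated)
```

With these hypothetical inputs, about 19% of the exposed group is lost, which the 5-20 rule would place in the intermediate range, yet the completers-only odds ratio falls from 12/7 (about 1.71) to 8/7 (about 1.14). A 20% loss that is unrelated to the outcome leaves the odds ratio unchanged, which illustrates why a threshold on the amount of missing data alone cannot settle the question.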

Gillings and Koch (1991) suggest a threshold value of 5% in the context of practically defining the ITT population: “All patients randomized who were known to take at least one dose of treatment and who provided any follow-up data for one or more key efficacy variables; in turn, ITT patients are allocated to treatments actually received.” They suggest that randomization errors and attrition through not taking at least one dose and not providing any follow-up data should be kept to 5%. It should be noted, however, that subjects who did not receive the surgical procedure in Example 2 in Section 10.6.3 should not be excluded. Therefore, it is important to realize that each therapeutic area may have its own unique issues, as noted in Section 10.1, so that recommendations for a given therapeutic area may not be applicable to another therapeutic area.

Deviations from the ITT principle (i.e., subjects not being analyzed as randomized or post-randomization exclusion of subjects) may or may not be acceptable, depending upon the situation, as discussed in Sections 10.6.2, 10.6.3, and 10.6.4, although those discussions are limited in scope. Tremendous effort is needed to research examples in the literature, covering more scenarios for a wider range of therapeutic areas.

Hopefully, recent research efforts in the prevention and treatment of missing data in clinical trials (e.g., NRC 2010; Dziura et al. 2013; Li et al. 2014) will reduce the amount of missing data and improve study quality in the near future. With improved quality of trial conduct, it is hoped that the number of protocol violators will be minimal, as will the risk of falsely declaring NI using ITT analysis in settings where T truly is clinically inferior to S.

Appendix 10.A Recommendations of the NRC Panel on the Prevention and Treatment of Missing Data in Clinical Trials (NRC 2010)

Trial objectives

1. The trial protocol should explicitly define (1) the objective(s) of the trial; (2) the associated primary outcome or outcomes; (3) how, when, and on whom the outcome or outcomes will be measured; and (4) the measures of intervention effects, that is, the causal estimands of primary interest. These measures should be meaningful for all study participants and estimable with minimal assumptions. Concerning the latter, the protocol should address the potential impact and treatment of missing data.

Reducing dropouts through trial design

2. Investigators, sponsors, and regulators should design clinical trials consistent with the goal of maximizing the number of participants who are maintained on the protocol-specified intervention until the outcome data is collected.

3. Trial sponsors should continue to collect information on key outcomes on participants who discontinue their protocol-specified intervention in the course of the study, except in those cases for which a compelling cost-benefit analysis argues otherwise, and this information should be recorded and used in the analysis.

4. The trial design team should consider whether participants who discontinue the protocol intervention should have access to and be encouraged to use specific alternative treatments. Such treatments should be specified in the study protocol.

5. Data collection and information about all relevant treatments and key covariates should be recorded for all initial study participants, whether or not participants received the intervention specified in the protocol.

Reducing dropouts through trial conduct

6. Study sponsors should explicitly anticipate potential problems of missing data. In particular, the trial protocol should contain a section that addresses missing data issues, including the anticipated amount of missing data and steps taken in trial design and trial conduct to monitor and limit the impact of missing data.

7. Informed consent documents should emphasize the importance of collecting outcome data from individuals who choose to discontinue treatment during the study, and they should encourage participants to provide this information whether or not they complete the anticipated course of study treatment.

8. All trial protocols should recognize the importance of minimizing the amount of missing data, and, in particular, they should set a minimum rate of completeness for the primary outcome(s) based on what has been achievable in similar past trials.
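As an illustration of how a prespecified minimum rate of completeness might be monitored during trial conduct, the following minimal Python sketch compares the proportion of randomized subjects with a primary outcome in each arm against such a minimum. The function name, the arm labels T and S, the counts, and the 90% minimum are hypothetical; they are not part of the NRC recommendations.

```python
def completeness_report(randomized, with_primary_outcome, minimum_rate):
    """Compare primary-outcome completeness in each arm with a prespecified minimum.

    randomized and with_primary_outcome map an arm label to a subject count;
    minimum_rate is the protocol-specified minimum completeness (0 to 1).
    """
    if not 0.0 <= minimum_rate <= 1.0:
        raise ValueError("minimum_rate must lie between 0 and 1")
    report = {}
    for arm, n_randomized in randomized.items():
        if n_randomized <= 0:
            raise ValueError("each arm needs at least one randomized subject")
        n_observed = with_primary_outcome.get(arm, 0)
        rate = n_observed / n_randomized
        report[arm] = {"rate": rate, "below_minimum": rate < minimum_rate}
    return report


if __name__ == "__main__":
    # Hypothetical interim counts for the test (T) and standard (S) arms.
    report = completeness_report({"T": 100, "S": 100}, {"T": 96, "S": 88}, 0.90)
    for arm, row in report.items():
        flag = "below minimum" if row["below_minimum"] else "acceptable"
        print(arm, round(row["rate"], 3), flag)
```

A check of this kind only flags the amount of missing data; it says nothing about the missing data mechanism, which has to be addressed separately.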

Treating missing data

9. Statistical methods for handling missing data should be specified by clinical trial sponsors in study protocols and their associated assumptions stated in a way that can be understood by clinicians.

10. Single-imputation methods, such as last observation carried forward and baseline observation carried forward, should not be used as the primary approach to the treatment of missing data unless the assumptions that underlie them are scientifically justified.

11. Parametric models in general, and random-effects models in particular, should be used with caution, with all their assumptions clearly spelled out and justified. Models relying on parametric assumptions should be accompanied by goodness-of-fit procedures.

12. It is important that the primary analysis of the data from a clinical trial should account for the uncertainty attributable to missing data so that under the stated missing data assumptions, the associated significance tests have valid Type I error rates and the confidence intervals have the nominal coverage properties. For inverse probability weighting and maximum likelihood methods, this analysis can be accomplished by appropriately computing standard errors, using either asymptotic results or the bootstrap method. It is necessary to use appropriate rules for multiply imputing missing responses and combining results across imputed datasets, because single imputation does not account for all sources of variability.

13. Weighted generalized estimating equation methods should be more widely used in settings where missing at random can be well justified and a stable weight model can be determined as a possibly useful alternative to parametric modeling.

14. When substantial missing data is anticipated, auxiliary information should be collected that is believed to be associated with reasons for missing values and with the outcomes of interest. This could improve the primary analysis through use of a more appropriate missing-at-random model or help carry out sensitivity analyses to assess the impact of missing data on estimates of treatment differences. In addition, investigators should seriously consider following up with all or a random sample of trial dropouts who have not withdrawn consent to ask them why they dropped out of the study and, if they are willing, to collect outcome measurements from them.

15. Sensitivity analyses should be part of the primary reporting of findings from clinical trials. Examining sensitivity to the assumptions about the missing data mechanism should be a mandatory component of reporting.
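Recommendation 12 refers to appropriate rules for combining results across imputed datasets. A commonly used set of such rules is Rubin's rules, and the minimal Python sketch below illustrates them under that assumption; it is not a procedure prescribed by the panel. The function name and the numerical inputs are hypothetical.

```python
import statistics


def pool_rubin(estimates, variances):
    """Combine estimates from m imputed datasets using Rubin's rules.

    estimates: treatment-effect estimate from each imputed dataset.
    variances: squared standard error of each of those estimates.
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = statistics.mean(estimates)        # average of the m estimates
    within = statistics.mean(variances)        # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (m - 1 divisor)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Hypothetical estimates and variances from three imputed datasets.
    pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print("pooled estimate:", pooled)
    print("total variance:", total)
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputed datasets; this between-imputation component is the source of variability that a single imputation ignores.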

Understanding the causes and degree of dropouts in clinical trials

16. The FDA and the National Institutes of Health should make use of their extensive clinical trial databases to carry out a program of research, both internal and external, to identify common rates and causes of missing data in different domains and how different models perform in different settings. The results of such research can be used to inform future study designs and protocols.

17. The FDA and drug, device, and biologic companies that sponsor clinical trials should carry out continued training of their analysts to keep abreast of up-to-date techniques for missing data analysis. The FDA should also encourage continued training of their clinical reviewers to make them broadly familiar with missing data terminology and missing data methods.

18. The treatment of missing data in clinical trials, being a crucial issue, should have a higher priority for sponsors of statistical research, such as the National Institutes of Health and the National Science Foundation. Progress is particularly needed in several important areas, namely (1) methods for sensitivity analysis and principled decision making based on the results from sensitivity analyses, (2) analysis of data where the missingness pattern is nonmonotone, (3) sample size calculations in the presence of missing data, (4) design of clinical trials, in particular, plans for follow-up after treatment discontinuation (degree of sampling, how many attempts were made, etc.), and (5) doubly robust methods to more clearly understand their strengths and vulnerabilities in practical settings. The development of software that supports coherent missing data analyses is also a high priority.

Appendix 10.B Mandatory Standards for the Prevention and Handling of Missing Data in PCOR (Li et al. 2014)

Standards on study design

1. Define [the] research question, in particular, the outcome(s):

• The study protocol should explicitly define (1) the objective(s) of the study; (2) the intervention(s) of interest; (3) the associated primary outcome(s) that quantify the impact of interventions for a defined period; (4) how, when, and on whom the outcome(s) will be measured; (5) potential confounders, if relevant; and (6) the measures of intervention effects, that is, the parameters (estimands) that capture the causal effect of the intervention of primary interest. The parameters should be meaningful for all study participants and estimable with minimal assumptions. This standard applies to all study designs that aim to assess intervention effectiveness.

• Defining outcome(s) precisely and accurately requires careful attention, because the choice of outcome may have important implications for study design, implementation, expected amount of and reason for missing data, and methods for handling missing data. For example, the outcome could be defined in the population that includes all participants randomized to the study intervention(s), regardless of the intervention participants actually received (i.e., ITT estimand), or the outcome could be defined in a more restricted population that includes only those who can tolerate the intervention for a given period. Outcome(s) could be measured after a short or long follow-up period and measured at one point in time or repeatedly over time. At a minimum, the primary outcome must be decided upon and adequately described in the study protocol. Imprecise and vague definition may lead to a lack of clarity in how to prevent and handle missing data.

2. Take steps in design and conduct to minimize missing data:

• Investigators should explicitly anticipate potential problems of missing data. The study protocol should contain a section that addresses missing data issues and steps taken in study design and conduct to monitor and limit the impact of missing data. If relevant, the protocol should include the anticipated amount of and reasons for missing data and plans to follow up with participants. This standard applies to all study designs for any type of research question.

3. Prespecify statistical methods for handling missing data:

• Statistical methods for handling missing data should be prespecified in the study protocol, and their associated assumptions stated in a way that can be understood by all stakeholders. The reasons for missing data should be considered in the analysis. This standard applies to all study designs for any type of research question.

Standards on study conduct

4. Continue collecting information on key outcomes:

• Whenever a participant discontinues some or all types of participation in a research study, the investigator should document the following: (1) the reason for discontinuation, (2) who decided that the participant would discontinue, and (3) whether the discontinuation involves some or all types of participation. Investigators should continue to collect information on key outcomes for participants who discontinue the protocol-specified intervention. This standard applies to prospective study designs that aim to assess intervention effectiveness.

5. Monitor missing data:

• For studies that include a data and safety monitoring board, the board should review plans for and the implementation of the prevention and handling of missing data. The board should review completeness and timeliness of data and recommend modifications as appropriate.

Standards on analysis

6. Account for uncertainty in handling missing data in the analysis:

• Statistical inference of intervention effects or measures of association should account for statistical uncertainty attributable to missing data. This means that under the stated missing data assumptions of the methods used for imputing missing data, the associated significance tests should have valid Type I error rates and confidence intervals should have the nominal coverage properties. This standard applies to all study designs for any type of research question.

7. Discourage single-imputation methods:

• Single-imputation methods, such as the last observation carried forward and baseline observation carried forward, generally should not be used as the primary approach for handling missing data in the analysis. This standard applies to all study designs for any type of research question.

8. Conduct sensitivity analysis:

• Examining sensitivity to the assumptions about the missing data mechanism (i.e., sensitivity analysis) should be a mandatory component of the study protocol, analysis, and reporting. This standard applies to all study designs for any type of research question.
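One simple form of sensitivity analysis for a binary outcome is to vary the assumed success probability among subjects with missing outcomes in each arm and to recompute the treatment difference over all randomized subjects. The minimal Python sketch below only illustrates that idea and is not an approach prescribed by Li et al. (2014); the counts, the arm labels T and S, and the function names are hypothetical.

```python
def risk_difference(arm_t, arm_s, p_missing_t, p_missing_s):
    """Success-rate difference (T minus S) over all randomized subjects.

    Each arm is a tuple (successes_observed, n_observed, n_randomized).
    p_missing_t and p_missing_s are the assumed success probabilities among
    subjects whose outcome is missing in the corresponding arm.
    """
    rates = []
    for (successes, n_observed, n_randomized), p_missing in (
        (arm_t, p_missing_t),
        (arm_s, p_missing_s),
    ):
        n_missing = n_randomized - n_observed
        rates.append((successes + p_missing * n_missing) / n_randomized)
    return rates[0] - rates[1]


def sensitivity_grid(arm_t, arm_s, steps=4):
    """Risk difference over a grid of assumptions about the missing outcomes."""
    grid = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            p_t, p_s = i / steps, j / steps
            grid.append((p_t, p_s, risk_difference(arm_t, arm_s, p_t, p_s)))
    return grid


if __name__ == "__main__":
    # Hypothetical data: 100 randomized per arm, 10 missing in T and 5 in S.
    arm_t = (70, 90, 100)
    arm_s = (60, 95, 100)
    for p_t, p_s, diff in sensitivity_grid(arm_t, arm_s):
        print(f"P(success | missing): T={p_t:.2f}, S={p_s:.2f} -> difference {diff:.3f}")
```

With these hypothetical counts, the difference ranges from 0.05 (all missing T outcomes are failures and all missing S outcomes are successes) to 0.20 (the reverse). Whether the trial conclusion holds across the plausible part of that range is the question a sensitivity analysis is meant to answer.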

Standards on reporting

9. Account for all participants entered in the study when reporting the results:

• All participants who enter the study should be accounted for when reporting the results, whether or not they are included in the analysis. Describe and justify any planned reasons for excluding participants from analysis. This standard applies to all study designs for any type of research question.

10. Report on data completeness and strategies applied to handle missing data:

• Report on data completeness and how missing data was handled in the analysis to facilitate interpretation of study results. The potential influence of missing data on the study results should be described. This standard applies to all study designs for any type of research question.

References

Alshurafa M, Briel M, Akl EA, Haines T, Moayyedi P, Gentles SJ, Rios L, Tran C, Bhatnagar N, Lamontagne F, Walter SD, and Guyatt GH (2012). Inconsistent Definitions for Intention-to-Treat in Relation to Missing Outcome Data: Systematic Review of the Methods Literature. PLoS ONE 7(11): e49163. doi:10.1371/journal.pone.0049163. https://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0049163 (Accessed: September 9, 2013).

Bone RC, Balk RA, Cerra FB, Dellinger RP, Fein AM, Knaus WA, Schein RMH, and Sibbald WJ (1992). Definitions for Sepsis and Organ Failure and Guidelines for the Use of Innovative Therapies in Sepsis. The ACCP/SCCM Consensus Conference Committee. American College of Chest Physicians/Society of Critical Care Medicine. Chest, 101:1644-1655.

Bradford-Hill A (1961). Principles of Medical Statistics. New York, NY: Oxford University Press.

Brittain E and Lin D (2005). A Comparison of Intent-to-Treat and Per-Protocol Results in Antibiotic Noninferiority Trials. Statistics in Medicine, 24:1-10.

Cochrane Collaboration (2002). Module 14: Further Issues in Meta-Analysis (Cochrane Handbook for Systematic Reviews of Interventions). https://www.cochranenet.org/openlearning/PDF/Module_14.pdf (Accessed: September 10, 2013).

Cohen J, Guyatt G, Bernard GR, Calandra T, Cook D, Elbourne D, Marshall J, Nunn A, and Opal S (2001), for a UK Medical Research Council International Working Party. New Strategies for Clinical Trials in Patients with Sepsis and Septic Shock. Critical Care Medicine Journal, 29:880-886.

CPMP Note for Guidance (III/3630/92-EN; 1995). Biostatistical Methodology in Clinical Trials in Applications for Marketing Authorizations for Medicinal Products. Statistics in Medicine, 14:1659-1682.

D’Agostino RB Sr, Massaro JM, and Sullivan LM (2003). Non-inferiority Trials: Design Concepts and Issues: The Encounters of Academic Consultants in Statistics. Statistics in Medicine, 22:169-186.

Dziura JD, Post LA, Zhao Q, Fu Z, and Peduzzi P (2013). Strategies for Dealing with Missing Data in Clinical Trials: From Design to Analysis. Yale Journal of Biology and Medicine, 86:343-358.

Ebbutt AF and Frith L (1998). Practical Issues in Equivalence Trials. Statistics in Medicine, 17:1691-1701.

European Agency for the Evaluation of Medicinal Products, Committee for Proprietary Medicinal Products (2000). Points to Consider on Switching Between Superiority and Noninferiority. https://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003658.pdf (Accessed: August 25, 2013).

Fergusson D, Aaron S, Guyatt G, and Herbert P (2002). Post-randomisation Exclusions: The Intention-to-Treat Principle and Excluding Patients from Analysis. British Medical Journal, 325:652-654.

Fisher CJ Jr, Dhainaut JF, Opal SM, Pribble JP, Balk RA, Slotman GJ, Iberti TJ, Rackow EC, Shapiro MJ, Greenman RL, Reines HD, Shelly MP, Thompson BW, LaBrecque JF, Michael A, Catalano MA, Knaus WA, and Sadoff JC (1994). Recombinant Human Interleukin 1 Receptor Antagonist in the Treatment of Patients with Sepsis Syndrome. Results from a Randomized, Double-Blind, Placebo-Controlled Trial. Phase III rhIL1-ra Sepsis Syndrome Study Group. Journal of the American Medical Association, 271:1836-1843.

Fisher LD, Dixon DO, Herson J, Frankowski RK, Hearron MS, and Peace KE (1990). Intention to Treat in Clinical Trials, in Statistical Issues in Drug Research and Development (American Statistical Associations Group), ed. Peace KE. New York: Marcel Dekker, 331-350.

Fleming TR (2008). Current Issues in Non-inferiority Trials. Statistics in Medicine, 27:317-332.

Fleming TR, Odem-Davis K, Rothmann MD, and Shen YL (2011). Some Essential Considerations in the Design and Conduct of Non-inferiority Trials. Clinical Trials, 8:432-439.

Garrett AD (2003). Therapeutic Equivalence: Fallacies and Falsification. Statistics in Medicine, 22:741-762.

Gillings D and Koch G (1991). The Application of the Principle of Intention-to-Treat to the Analysis of Clinical Trials. Drug Information Journal, 25:411-424.

Gonzalez CD, Bolaños R, and de Sereday M (2009). Editorial on Hypothesis and Objectives in Clinical Trials: Superiority, Equivalence and Non-inferiority. Thrombosis Journal, 7:3. doi:10.1186/1477-9560-7-3.

Gøtzsche PC (2006). Lessons from and Cautions about Noninferiority and Equivalence Randomized Trials [Editorial]. Journal of the American Medical Association, 295:1172-1174.

Gravel J, Opatrny L, and Shapiro S (2007). The Intention-to-Treat Approach in Randomized Trials: Are Authors Saying What They Do and Doing What They Say? Clinical Trials, 4:350-356.

Greene CJ, Morland LA, Durkalski VL, and Frueh BC (2008). Noninferiority and Equivalence Designs: Issues and Implications for Mental Health Research. Journal of Traumatic Stress, 21(5):433-443.

Hauck WW and Anderson S (1999). Some Issues in the Design and Analysis of Equivalence Trials. Drug Information Journal, 33:109-118.

Heritier SR, Gebski VJ, and Keech AC (2003). Inclusion of Patients in Clinical Trial Analysis: The Intention-to-Treat Principle. Medical Journal of Australia, 179:438-440.

Hernan MA and Hernandez-Diaz S (2012). Beyond the Intention-to-Treat in Comparative Effectiveness Research. Clinical Trials, 9(1):48-55.

Hill CL, LaValley MP, and Felson DT (2002). Secular Changes in the Quality of Published Randomized Clinical Trials in Rheumatology. Arthritis Rheum, 46:779-784.

Hollis S and Campbell F (1999). What Is Meant by Intention-to-Treat Analysis? Survey of Published Randomised Controlled Trials. British Medical Journal, 319(7211):670-674.

Houbiers JG, Brand A, van de Watering LM, Hermans J, Verwey PJ, Bijnen AB, Pahlplatz P, Schattenkerk ME, Wobbes T, de Vries JE, Klementschitsch P, van de Maas AHM, and van de Velde CJH (1994). Randomised Controlled Trial Comparing Transfusion of Leucocyte Depleted or Buffy Coat Depleted Blood in Surgery for Colorectal Cancer. Lancet, 344:573-578.

International Conference on Harmonization (ICH) E9 Guideline (1998). Statistical Principles for Clinical Trials. https://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM073137.pdf (Accessed: September 27, 2012).

Kaul S and Diamond GA (2007). Making Sense of Noninferiority: A Clinical and Statistical Perspective on Its Application to Cardiovascular Clinical Trials. Progress in Cardiovascular Diseases, 49(4):284-299.

Kristman V, Manno M, and Cote P (2004). Loss to Follow-up in Cohort Studies: How Much Is Too Much? European Journal of Epidemiology, 19(8):751-760.

Lachin JM (2000). Statistical Considerations in the Intent-to-Treat Principle. Controlled Clinical Trials, 21(3):167-189.

Le Henanff A, Giraudeau B, Baron G, and Ravaud P (2006). Quality of Reporting of Noninferiority and Equivalence Randomized Trials. Journal of the American Medical Association, 295:1147-1151.

Lesaffre E (2008). Superiority, Equivalence and Non-inferiority Trials. Bulletin of the NYU Hospital for Joint Diseases, 66(2):150-154.

Lewis JA and Machin D (1993). Intention to Treat: Who Should Use ITT? British Journal of Cancer, 68:647-650.

Li T, Hutfless S, Scharfstein DO, Daniels MJ, Hogan JW, Little RJA, Roy JA, Law AH, and Dickersin K (2014). Standards Should Be Applied in the Prevention and Handling of Missing Data for Patient-Centered Outcomes Research: A Systematic Review and Expert Consensus. Journal of Clinical Epidemiology, 67:15-32.

Loughnan BA, Carli F, Romney M, Dore CJ, and Gordon H (2000). Randomized Controlled Comparison of Epidural Bupivacaine versus Pethidine for Analgesia in Labour. British Journal of Anaesthesia, 84:715-719.

Moher D, Hopewell S, Schulz KF, Montori V, Gotzsche PC, Devereaux PJ, Elbourne D, Egger M, and Altman DG (2010). CONSORT 2010 Explanation and Elaboration: Updated Guidelines for Reporting Parallel Group Randomised Trials. British Medical Journal, 340:c869.

Moher D, Schulz KF, and Altman DG (2001). The CONSORT Statement: Revised Recommendations for Improving the Quality of Reports of Parallel-Group Randomized Trials. Annals of Internal Medicine, 134:657-662.

National Research Council (2010). The Prevention and Treatment of Missing Data in Clinical Trials. Panel on Handling Missing Data in Clinical Trials. Committee on National Statistics, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press. https://csph.ucdenver.edu/sites/kittelson/Bios6648-2013/Lctnotes/2013/NASmissingData.pdf (Accessed: January 5, 2014).

Neal K (2009). Efficacy vs. Effectiveness. https://getedited.wordpress.com/2009/10/26/efficacy-vs-effectiveness (Accessed: November 3, 2013).

Newell DJ (1992). Intention-to-Treat Analysis: Implications for Quantitative and Qualitative Research. International Journal of Epidemiology, 21:837-841.

Ng T-H (2001). Choice of Delta in Equivalence Testing. Drug Information Journal, 35:1517-1527.

Nordic Council on Medicine (1989). Good Clinical Practice. Uppsala, Sweden: Nordic Guidelines, NLN Publication No. 28.

O’Neill RT and Temple R (2012). The Prevention and Treatment of Missing Data in Clinical Trials: An FDA Perspective on the Importance of Dealing with It. Clinical Pharmacology & Therapeutics, 91(3):550-554.

Pater C (2004). Equivalence and Noninferiority Trials: Are They Viable Alternatives for Registration of New Drugs? (III). Current Controlled Trials in Cardiovascular Medicine, 5:8.

Peto R, Pike MC, Armitage P, Breslow NE, Cox DR, Howard SV, Mantel N, McPherson K, Peto J, and Smith PG (1976). Design and Analysis of Randomized Clinical Trials Requiring Prolonged Observation of Each Patient. I. Introduction and Design. British Journal of Cancer, 34:585-612.

Piaggio G, Elbourne DR, Altman DG, Pocock SJ, and Evans SJ (2006). Reporting of Noninferiority and Equivalence Randomized Trials: An Extension of the CONSORT Statement. Journal of the American Medical Association, 295:1152-1160.

Pocock SJ (2003). Pros and Cons of Noninferiority Trials. Fundamental & Clinical Pharmacology, 17:483-490.

Polit DF and Gillespie B (2009). The Use of the Intention-to-Treat Principle in Nursing Clinical Trials. Nursing Research, 59:391-399.

Polit DF and Gillespie BM (2010). Intention-to-Treat in Randomized Controlled Trials: Recommendations for a Total Trial Strategy. Research in Nursing & Health, 33(4):355-368.

Roland M and Torgerson DJ (1998). Understanding Controlled Trials. What Are Pragmatic Trials? British Medical Journal, 316:285.

Rothmann MD, Wiens BL, and Chan ISF (2011). Design and Analysis of Non-Inferiority Trials. Boca Raton, FL: Chapman & Hall/CRC.

Sanchez MM and Chen X (2006). Choosing the Analysis Population in Non-inferiority Studies: Per Protocol or Intent-to-Treat. Statistics in Medicine, 25:1169-1181.

Schulz KF, Altman DG, and Moher D (2010). CONSORT 2010 Statement: Updated Guidelines for Reporting Parallel Group Randomised Trials. British Medical Journal, 340:c332.

Schulz KF and Grimes D (2002). Sample Size Slippages in Randomised Trials: Exclusions and the Lost and Wayward. Lancet, 359(9308):781-785.

Schumi J and Wittes JT (2011). Through the Looking Glass: Understanding Noninferiority. Trials, 12:106.

Schwartz D and Lellouch J (1967). Explanatory and Pragmatic Attitudes in Therapeutic Trials. Journal of Chronic Diseases, 20:637-648.

Scott IA (2009). Non-inferiority Trials: Determining Whether Alternative Treatments Are Good Enough. Medical Journal of Australia, 190:326-330.

Sheng D and Kim MY (2006). The Effects of Non-compliance on Intent-to-Treat Analysis of Equivalence Trials. Statistics in Medicine, 25:1183-1199.

Thaul S (2012). How FDA Approves Drugs and Regulates Their Safety and Effectiveness. CRS (Congressional Research Service) Report R41983 2012. https://www.fas.org/sgp/crs/misc/R41983.pdf (Accessed: November 3, 2013).

Treadwell J, Uhl S, Tipton K, Singh S, Santaguida L, Sun X, Berkman N, Viswanathan M, Coleman C, Shamliyan T, Wang S, Ramakrishnan R, and Elshaug A (2012). Assessing Equivalence and Noninferiority. Methods Research Report. (Prepared by the EPC Workgroup under Contract No. 290-2007-10063.) AHRQ Publication No. 12-EHC045-EF. Rockville, MD: Agency for Healthcare Research and Quality, June 2012. https://www.effectivehealthcare.ahrq.gov/ehc/products/365/1154/Assessing-Equivalence-and-Noninferiority_FinalReport_20120613.pdf (Accessed: March 1, 2014).

Treanor JJ, Hayden FG, Vrooman PS, Barbarash R, Bettis R, Riff D, Kinnersley N, Ward P, and Mills RG (2000). Efficacy and Safety of the Oral Neuraminidase Inhibitor Oseltamivir in Treating Acute Influenza: A Randomized Controlled Trial. Journal of the American Medical Association, 283:1016-1024.

U.S. Food and Drug Administration (1988). Guideline for the Format and Content of the Clinical and Statistical Sections of New Drug Applications. Rockville, MD: U.S. Department of Health and Human Services.

U.S. Food and Drug Administration (2010). Draft Guidance for Industry: Non-inferiority Clinical Trials. https://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM202140.pdf (Accessed: August 25, 2013).

Wangge G, Klungel OH, Roes KCB, de Boer A, Hoes AW, et al. (2010). Room for Improvement in Conducting and Reporting Non-Inferiority Randomized Controlled Trials on Drugs: A Systematic Review. PLoS ONE 5(10): e13550. doi:10.1371/journal.pone.0013550.

Wertz RT (1995). Intention to Treat: Once Randomized, Always Analyzed. Clinical Aphasiology, 23:57-64.

Wiens BL and Zhao W (2007). The Role of Intention to Treat in Analysis of Noninferiority Studies. Clinical Trials, 4:286-291.