ABSTRACT
As discussed in Chapter 2, the randomization model is often more appropriate
than the population model, for example in a randomized clinical trial, which
is usually based on a “convenience sample” rather than a random sample.
In that case, a permutation test is the “platinum standard” (Tukey, 1993).
However, in practice there are situations where a permutation test is not
performed although it is doable and appropriate. Berger (2009) discussed and
criticized this in a Socratic dialogue where Socrates asked: “If you can observe
the exact p-value, then why would you go on to attempt to approximate it?”
Permutation tests also have disadvantages. On the one hand, permutation
tests are computer intensive, as there are a huge number of possible permuta-
tions in the case of large samples. Although this point is more pronounced in
the case of more than two groups (see Chapter 9), it is also relevant for the
two-sample problem. For instance, for n1 = n2 = 20, there are more than 137
billion permutations (1 billion is defined here as 10^9). Obviously, bootstrap
methods are computer intensive too. However, this disadvantage is diminishing
over time. Very efficient algorithms have been developed (see, e.g., Good, 2000,
Chapter 13). Moreover, advances in computing power have been enormous. Modern PCs
would probably have been unimaginable to R. A. Fisher when he invented permutation
tests in the 1930s. In addition, approximate permutation tests can be performed
based on a random sample of permutations.
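
To make the count and the approximate approach concrete, the following is a minimal Python sketch and not an implementation taken from the cited literature. The function names, the choice of the absolute difference in means as test statistic, the default of 10,000 random permutations, and the fixed seed are all assumptions made for illustration. The first function counts the distinct allocations of n1 + n2 observations to two groups; the second estimates a two-sided p-value from a random sample of permutations.

```python
import math
import random


def n_two_sample_permutations(n1, n2):
    """Number of distinct ways to split n1 + n2 observations into
    groups of sizes n1 and n2 (a binomial coefficient)."""
    return math.comb(n1 + n2, n1)


def approx_permutation_pvalue(x, y, n_perm=10_000, seed=0):
    """Monte Carlo estimate of a two-sided permutation p-value.

    Assumed test statistic (illustrative choice): the absolute
    difference between the two group means.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n1 = len(x)

    def mean_diff(values):
        g1, g2 = values[:n1], values[n1:]
        return sum(g1) / len(g1) - sum(g2) / len(g2)

    observed = abs(mean_diff(pooled))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reallocation of observations to groups
        # Small tolerance guards against floating-point ties.
        if abs(mean_diff(pooled)) >= observed - 1e-12:
            hits += 1
    # Count the observed allocation as well, so the estimate is never 0.
    return (hits + 1) / (n_perm + 1)


print(n_two_sample_permutations(20, 20))  # 137846528820
```

For n1 = n2 = 20 the count is 137,846,528,820, which agrees with the figure of more than 137 billion given above. Adding 1 to both numerator and denominator is one common convention for the Monte Carlo estimate; other conventions exist, and the number of sampled permutations determines how closely the estimate approaches the exact p-value.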