ABSTRACT

In a cluster randomized trial, research participants are not sampled independently but as a group, or cluster. This chapter explains why clustering leads to problems of analysis and describes several possible approaches, including the use of summary statistics for the cluster, robust standard errors, multilevel modeling, and generalized estimating equations, as well as the stepped wedge design. The intra-cluster correlation coefficient is the correlation between pairs of subjects, where each pair is chosen at random from the same cluster. Because observations within a cluster are correlated, estimating the sample size for a cluster design also requires an estimate of the design effect. Robust standard errors are a large-sample approximation that breaks down when the number of clusters is small. The chapter describes the need for an increased sample size and the role of the intra-cluster correlation coefficient in estimating this. It also describes why cluster randomization leads to practical difficulties and possible biases in recruitment and data collection.
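
As a rough illustration of how the intra-cluster correlation coefficient feeds into sample size, the sketch below uses the standard design effect for equal cluster sizes, 1 + (m - 1) × ICC, where m is the number of participants per cluster. The function names and the example figures are hypothetical and are not taken from the chapter; unequal cluster sizes would need a different adjustment.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc is assumed here to lie between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


def clustered_sample_size(n_individual: int, cluster_size: float, icc: float) -> int:
    """Inflate a sample size computed for individual randomization.

    n_individual is the total sample size an individually randomized
    trial would need (hypothetical input, obtained elsewhere).
    """
    inflated = n_individual * design_effect(cluster_size, icc)
    # Round away tiny floating-point noise before taking the ceiling.
    return math.ceil(round(inflated, 9))


if __name__ == "__main__":
    # Illustrative values only: 5 participants per cluster, ICC of 0.25.
    print(design_effect(5, 0.25))
    print(clustered_sample_size(100, 5, 0.25))
```

Even a modest intra-cluster correlation can inflate the required sample size substantially when clusters are large, which is why the coefficient has to be estimated before the trial is planned.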