ABSTRACT

Suppose each unit of the population $U = (1, \ldots, i, \ldots, N)$ consists of a number of subunits and hence may be regarded as a cluster, the $i$th unit forming a cluster of $M_i$ subunits with a total $Y_i$ for the variable $y$ of interest, $i = 1, \ldots, N$. For example, we may consider districts as clusters and the villages in them as subunits or cluster elements. Then the quantity of interest is $Y = \sum_{i=1}^{N} Y_i$ or

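\[
\bar{Y} = \frac{\sum_{i=1}^{N} \sum_{j=1}^{M_i} Y_{ij}}{\sum_{i=1}^{N} M_i} = \frac{\sum_{i=1}^{N} M_i \bar{Y}_i}{\sum_{i=1}^{N} M_i},
\]

where $Y_{ij}$ is the value of the $j$th element of the $i$th cluster and $\bar{Y}_i = \frac{1}{M_i} \sum_{j=1}^{M_i} Y_{ij}$ is the $i$th cluster mean of $y$. Now, often it is not feasible to survey all the $M_i$ elements of the $i$th cluster to ascertain $Y_i$.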

Instead, a policy that may be implemented is first to take a sample $s$ of $n$ clusters out of $U$ according to a suitable design $p$ and then, from each selected cluster $i$, to take a further sample of $m_i$ elements out of the $M_i$ elements in it, following another suitable selection scheme; the selection procedures in the selected clusters must be independent of one another. One may then derive suitable unbiased estimators, say $T_i$, of $Y_i$ for $i \in s$ and from them a final estimator for $Y$ or $\bar{Y}$. This is two-stage sampling, the clusters forming the primary or first-stage units (psu or fsu) and the elements within the fsus being called the second-stage units (ssu). Further stages may be added by allowing the elements to consist of subelements, the third-stage units, which are subsampled in turn, and so on, leading in general to multistage sampling. We now discuss the estimation of totals or means, and of the variances of their estimators, in multistage sampling.
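
The two-stage procedure described above can be illustrated with a minimal Python sketch. It is only one possible instance: the text leaves the first-stage design p and the second-stage selection scheme open, whereas this sketch assumes simple random sampling without replacement (SRSWOR) at both stages. The function name two_stage_total, the argument subsample_size, and the toy data are hypothetical and chosen for illustration. Under these assumptions, T_i = M_i × (subsample mean) is unbiased for Y_i, and (N/n) × Σ T_i over the selected clusters is unbiased for Y.

```python
import random


def two_stage_total(clusters, n, subsample_size, rng):
    """Estimate the population total Y from a two-stage sample.

    clusters       : list of lists; clusters[i] holds the values Y_ij of cluster i
    n              : number of clusters (first-stage units) to select
    subsample_size : function mapping the cluster size M_i to the subsample size m_i
    rng            : random.Random instance used for all draws

    Assumption (not from the text): SRSWOR at both stages.
    """
    N = len(clusters)
    # First stage: select n of the N clusters without replacement.
    selected = rng.sample(range(N), n)

    sum_T = 0.0
    for i in selected:
        M_i = len(clusters[i])
        m_i = subsample_size(M_i)
        # Second stage: a fresh draw within each selected cluster,
        # made separately from the draws in the other clusters.
        subsample = rng.sample(clusters[i], m_i)
        # T_i = M_i * (subsample mean), unbiased for Y_i under SRSWOR.
        sum_T += M_i * sum(subsample) / m_i

    # Expand the first-stage sample to the whole population.
    return N / n * sum_T


# Toy population (hypothetical data): three clusters of sizes 3, 2, and 4.
population = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
rng = random.Random(0)

# Selecting every cluster and every element reproduces the true total.
full = two_stage_total(population, n=3, subsample_size=lambda M: M, rng=rng)

# A genuine two-stage sample: 2 clusters, 2 elements from each.
partial = two_stage_total(population, n=2, subsample_size=lambda M: 2, rng=rng)
```

An estimate of the mean per element follows by dividing the estimated total by the sum of the M_i, provided all N cluster sizes are known; when they are not, that denominator must itself be estimated, which this sketch does not cover.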