ABSTRACT

In this chapter we consider the simple proportion. This is the basic building block from which most of the measures developed subsequently in this book are derived. We aim to determine the frequency of a binary characteristic in some population. We take a sample of size n. This most often means n individuals; it can also refer to some other kind of unit, but for purposes of exposition, with health-related applications particularly in mind, we will assume that the unit of sampling is the individual person. We make the standard assumptions that sampling is representative and that all units sampled are statistically independent of one another. We find that r out of the n individuals are positive for the characteristic of interest. We report that the proportion is p = r/n. This is called the empirical estimate, and is also the maximum likelihood estimate based on the binomial distribution, defined in Section 3.4.4. It serves as the obvious point estimate of the theoretical proportion or population proportion π. The empirical estimate is an unbiased estimate of π in the usual statistical sense that the expected value of p, its average over repeated samples, is π. The larger the sample size n, the more closely p tends to approximate π.
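The two properties just stated, unbiasedness and improving precision as n grows, can be illustrated with a minimal simulation sketch. It is an illustration only, not a method taken from this book: the function names, the choice π = 0.3, the sample sizes and the number of repeats are all assumptions made for the example, and each individual is drawn independently with probability π of being positive, in line with the standard assumptions above.

```python
import random
import statistics
from typing import List


def empirical_proportion(r: int, n: int) -> float:
    """Return p = r/n, the empirical estimate of the proportion."""
    if n <= 0 or not 0 <= r <= n:
        raise ValueError("need n > 0 and 0 <= r <= n")
    return r / n


def simulate_estimates(pi: float, n: int, repeats: int, seed: int = 1) -> List[float]:
    """Draw `repeats` independent samples of size n and return p for each.

    Hypothetical helper: every individual is positive with probability pi,
    independently of all others.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        # r = number of positive individuals in this sample
        r = sum(rng.random() < pi for _ in range(n))
        estimates.append(empirical_proportion(r, n))
    return estimates


if __name__ == "__main__":
    pi = 0.3  # assumed population proportion, chosen for illustration
    for n in (10, 100, 1000):
        estimates = simulate_estimates(pi, n, repeats=2000)
        # The mean of p should sit near pi; its spread should shrink with n.
        print(n, round(statistics.mean(estimates), 3),
              round(statistics.pstdev(estimates), 3))
```

Under these assumptions, the average of the simulated values of p should lie close to π for every n, while their standard deviation should fall as n increases.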