ABSTRACT

This chapter presents methods for using R to analyze data from two-phase samples and capture-recapture studies. Two-phase sampling, sometimes called double sampling, is useful when the key variables of interest are relatively expensive to measure but related variables can be measured fairly easily. For example, an inexpensive but possibly inaccurate screening test given to a large initial sample can be used to define strata for a subsample of persons who then receive a more comprehensive diagnostic test. This chapter shows how to analyze data from simple two-phase sampling designs, using an example where information from a large simple random sample is used to select a stratified random sample in the second phase.
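The screening-test design described above can be sketched in base R. This is a minimal illustration on simulated data, not the chapter's own example: the sample sizes, screening rates, disease probabilities, and all object names (phase1, phase2, w_h, ybar_h, p_hat) are assumptions made for the sketch. It computes the usual two-phase stratified point estimate, which weights each phase II stratum mean by that stratum's share of the phase I sample. Variance estimation is not shown.

```r
# Hypothetical two-phase design on simulated data (base R only).
set.seed(13)

# Phase I: simple random sample with an inexpensive screening result.
n1 <- 1000
phase1 <- data.frame(
  id = seq_len(n1),
  screen = sample(c("negative", "positive"), n1,
                  replace = TRUE, prob = c(0.8, 0.2))
)

# Phase II: stratified random subsample; strata are the screening results.
n2_per_stratum <- c(negative = 50, positive = 100)  # assumed allocation
ids_by_stratum <- split(phase1$id, phase1$screen)
phase2_ids <- unlist(lapply(names(ids_by_stratum), function(h) {
  sample(ids_by_stratum[[h]], n2_per_stratum[[h]])
}))
phase2 <- phase1[phase1$id %in% phase2_ids, ]

# The expensive diagnostic result is observed only in phase II (simulated here).
p_disease <- ifelse(phase2$screen == "positive", 0.6, 0.05)
phase2$disease <- rbinom(nrow(phase2), 1, p_disease)

# Point estimate: sum over strata of (phase I share) * (phase II stratum mean).
w_h    <- tapply(phase1$id, phase1$screen, length) / n1
ybar_h <- tapply(phase2$disease, phase2$screen, mean)
p_hat  <- sum(w_h * ybar_h)
p_hat
```

The phase I stratum shares stand in for the unknown population stratum proportions, which is why the large, inexpensive first phase can improve the estimate.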

The size of a population can be estimated by comparing multiple independent samples randomly selected from it. In the simplest form of capture-recapture estimation, two independent simple random samples are taken and the number of population members found in both samples is used to estimate the population size. This chapter introduces R functions that can be used to estimate population sizes, with associated confidence intervals based on likelihood ratio tests or the bootstrap, from two-sample and multiple-sample capture-recapture studies. Code and output are given for examples in Chapter 13 of Sampling: Design and Analysis, Third Edition.
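As a rough illustration of the two-sample case, the sketch below computes the classical Lincoln-Petersen estimate, which equates the proportion of the second sample that was also in the first sample with the proportion of the population captured by the first sample. The function name lincoln_petersen and the counts are hypothetical, and this is not one of the functions the chapter introduces. Confidence intervals are not shown.

```r
# Two-sample capture-recapture point estimate (Lincoln-Petersen).
# n1: size of the first sample
# n2: size of the second sample
# m:  number of population members found in both samples
lincoln_petersen <- function(n1, n2, m) {
  stopifnot(m > 0, m <= n1, m <= n2)
  # m / n2 estimates n1 / N, so N is estimated by n1 * n2 / m.
  n1 * n2 / m
}

# Hypothetical counts, for illustration only.
N_hat <- lincoln_petersen(n1 = 200, n2 = 100, m = 20)
N_hat  # 1000
```

With these counts, 20 of the 100 members of the second sample were also in the first sample, so the first sample of 200 is taken to be about one fifth of the population.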