ABSTRACT

In epidemiological studies, one is often interested in estimating the size of a diseased population, such as diabetes or cancer patients, in order to assess the completeness of the corresponding registry. There are also disease populations whose members are difficult to reach, such as infectious drug users and AIDS patients. For these elusive populations, the size is of interest in its own right. Capture-recapture techniques are frequently employed when multiple incomplete lists of individuals are available (Hook and Regal; WHO; IWGDMF). The most popular method in the literature for this kind of multi-list data is the family of hierarchical log-linear models introduced by Fienberg, who applied them to homogeneous populations. In applications, both the wide variation among the estimates these models produce and the large gap between the estimate from the saturated log-linear model and the true population size may be caused by heterogeneity among individuals. For heterogeneous populations, one has to deal with the issue of singularity or non-identifiability.