ABSTRACT

Given a multivariate parameter, θ, and appropriate data, prevalence estimation is the problem of counting the number of entries in θ that depart from their hypothesized null values. The problem has two main motivations. First, when a population consists of two sub-populations, we may want to know the prevalence of each. Examples include a sub-population of respondents and a sub-population of non-respondents in personalized medicine; active and inactive subjects in neuroimaging; associated and non-associated genes in genome-wide association studies (GWAS); and tolerant and intolerant animals in toxicology. Second, various multiple testing algorithms may benefit from knowledge of the effect's prevalence. The Adaptive Benjamini-Hochberg algorithm is one of many such algorithms: it "adapts" by estimating the signal's prevalence before the multiple testing stage. In this chapter, we review the vast literature on prevalence estimators, organize it by design principles and statistical guarantees, and offer recommendations to the practitioner.