ABSTRACT

In a clinical environment, trained human observers are often necessary components of a diagnostic system (e.g., in the interpretation of cardiograms or the assessment of mammograms). Because human observers (readers) are part of the system, its accuracy may reflect their personal characteristics, such as experience, physiological abilities, and personal preferences. Unfortunately, even within a well-defined group, the characteristics of trained observers differ substantially (e.g., Beam, 1998; Gur et al., 2007). When assessing system accuracy, it is therefore often appropriate to consider a target "population of readers" as part of the conceptual framework. In other words, one interprets the results of the evaluation of a diagnostic system not in terms of the readers who happened to participate in the study, but in the context of the target population of radiologists likely to use the technology. The selection of radiologists and the design of multireader studies are described in several books (Swets and Pickett, 1982; Zhou et al., 2002).