ABSTRACT

This chapter introduces the binary data collection paradigm, in which the ground truth takes one of two values, non-diseased or diseased, and the radiologist's response is likewise limited to two values: yes, the patient is diseased, or no, the patient is non-diseased. It is also termed the yes/no paradigm. It leads to a 2 × 2 decision vs. truth state table and to the definitions of true positive and false positive and of their complements, false negative and true negative, respectively. An analogy is used to explain the terms sensitivity and specificity. The corresponding fractions, true positive fraction (TPF) and false positive fraction (FPF), are defined. Sensitivity and specificity are estimated by TPF and (1 – FPF), respectively. The concept of disease prevalence, in both population and laboratory studies, is explained. Together with sensitivity and specificity, prevalence is used to develop expressions for positive and negative predictive values (PPV and NPV), which are more relevant to clinicians than sensitivity and specificity. The expressions are illustrated with R code using values typical for screening mammography. It is shown that overall accuracy is a poor measure of performance because it is strongly influenced by prevalence. The reasons why PPV and NPV are irrelevant to laboratory studies are noted. The binary paradigm is the starting point for understanding the more common ratings paradigm described later.
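
As a rough illustration of how the predictive values and overall accuracy depend on prevalence, the following minimal R sketch applies the standard Bayes-rule expressions. The function name `predictive_values` and the numerical values for sensitivity, specificity, and prevalence are assumptions chosen for illustration only; they are not the values or the code used in the chapter.

```r
# Illustrative inputs only (assumed values, not taken from the chapter)
sensitivity <- 0.80   # estimated by TPF
specificity <- 0.90   # estimated by 1 - FPF
prevalence  <- 0.005  # assumed fraction of diseased patients in the population

# Hypothetical helper: PPV, NPV and overall accuracy from
# sensitivity, specificity and prevalence
predictive_values <- function(sens, spec, prev) {
  tp <- sens * prev               # P(diseased and called diseased)
  fn <- (1 - sens) * prev         # P(diseased and called non-diseased)
  fp <- (1 - spec) * (1 - prev)   # P(non-diseased and called diseased)
  tn <- spec * (1 - prev)         # P(non-diseased and called non-diseased)
  list(
    ppv      = tp / (tp + fp),    # P(diseased | called diseased)
    npv      = tn / (tn + fn),    # P(non-diseased | called non-diseased)
    accuracy = tp + tn            # P(decision agrees with truth)
  )
}

pv <- predictive_values(sensitivity, specificity, prevalence)
print(pv)

# A reader who always answers "no" has sensitivity 0 and specificity 1.
# PPV is undefined (NaN) for this reader, so only accuracy is reported.
always_no <- predictive_values(0, 1, prevalence)
print(always_no$accuracy)
```

Under these assumed inputs, the reader who never reports disease attains an accuracy of 1 minus the prevalence, which is higher than that of the reader with useful sensitivity and specificity. This is one way to see why accuracy is a poor measure of performance when prevalence is low.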