ABSTRACT

The empirical AUC is defined as the area under the empirical ROC plot, which consists of straight-line segments connecting adjacent operating points, including the trivial ones at (0,0) and (1,1). It has two advantages: it is independent of any modeling assumptions, and analytical methods are available for determining its sampling behavior. Its main disadvantage is that it yields no insight into the factors limiting the observer's performance. Analysis based on the empirical AUC is often termed nonparametric. Notation for labeling individual cases is introduced and is used in subsequent chapters. Formulae are presented for computing empirical operating points from ratings data and are illustrated using the dataset introduced in Chapter 03. An important theorem, often termed Bamber's theorem, is derived; it relates the empirical area under the ROC to a formal statistic known as the Wilcoxon. The importance of the theorem derives from its applications to nonparametric analysis of ROC data. An online appendix describes details of calculating the Wilcoxon statistic using an R-coded example. Since the empirical AUC always yields a number, the researcher could be unaware of unusual behavior of the empirical ROC curve, so it is always a good idea to plot the data and look for evidence of large extrapolations. An example would be data points clustered at low FPF values: the line connecting the uppermost nontrivial operating point to (1,1) then contributes a large part of the AUC that is unsupported by intermediate operating points.
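
As a rough illustration of the equality stated by Bamber's theorem, the following Python sketch computes empirical operating points from ratings, the trapezoidal area under them, and the Wilcoxon statistic, and checks that the two numbers agree. It is a minimal sketch, not the R code of the online appendix: the function names and the small ratings lists are hypothetical and are not the Chapter 03 dataset. It assumes that higher ratings indicate greater confidence in disease and that ties are scored as one half.

```python
from itertools import product


def operating_points(non_diseased, diseased):
    """Empirical (FPF, TPF) points, from (0,0) up to (1,1).

    A case counts as positive when its rating is >= the threshold.
    The lowest threshold classifies every case as positive, so the
    trivial point (1,1) is produced automatically.
    """
    thresholds = sorted(set(non_diseased) | set(diseased), reverse=True)
    points = [(0.0, 0.0)]  # trivial point: no case called positive
    for t in thresholds:
        fpf = sum(r >= t for r in non_diseased) / len(non_diseased)
        tpf = sum(r >= t for r in diseased) / len(diseased)
        points.append((fpf, tpf))
    return points


def trapezoidal_auc(points):
    """Area under the straight-line segments joining adjacent points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def wilcoxon(non_diseased, diseased):
    """Wilcoxon statistic: average of the kernel over all case pairs.

    The kernel is 1 if the diseased rating is higher, 0.5 for a tie,
    and 0 otherwise.
    """
    def kernel(x, y):
        return 1.0 if x < y else 0.5 if x == y else 0.0

    total = sum(kernel(x, y) for x, y in product(non_diseased, diseased))
    return total / (len(non_diseased) * len(diseased))


# Hypothetical ratings on a 1-5 scale (illustration only).
non_diseased = [1, 1, 2, 2, 3, 4]
diseased = [2, 3, 4, 4, 5, 5]

auc = trapezoidal_auc(operating_points(non_diseased, diseased))
w = wilcoxon(non_diseased, diseased)
print(auc, w)  # the two values agree up to floating-point rounding
```

Plotting the output of `operating_points` before relying on either number is one way to spot the large extrapolation to (1,1) described above.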