ABSTRACT

Irrespective of how it is estimated, AUC is a realization of a random variable and, as such, is subject to variability. Three sources of variability affecting AUC estimates are identified: case sampling, between-reader, and within-reader. Unless strict replication of readings is used, the within-reader component cannot be separated from the others. Modern methods can analyze datasets without the need for such replication. AUC depends on a case-set index {c}, c = 1, 2, ..., C, where each case-set is a random sample of specified numbers of non-diseased and diseased cases. Two resampling methods for estimating the variability of almost any statistic, the jackknife and the bootstrap, are described. Jackknife and bootstrap estimates of case sampling variability, computed using R code, are compared to an analytic nonparametric method that is applicable to the empirical AUC. The concept of a calibrated data simulator is introduced and illustrated with a simple example. A source of variability not generally considered, namely threshold variability, is introduced, and a cautionary note is struck with respect to indiscriminate usage of the empirical AUC.
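To make the two resampling methods concrete, the following is a minimal Python sketch (the chapter itself uses R) of jackknife and bootstrap estimates of the case sampling standard deviation of the empirical AUC. It assumes ratings simulated from an equal-variance binormal model with a separation of 1.5 and 50 cases per truth state; the function names, sample sizes, seeds, and number of bootstrap samples are illustrative assumptions and are not taken from the chapter.

```python
import random
import statistics


def empirical_auc(non_diseased, diseased):
    """Empirical (Wilcoxon) AUC: fraction of (non-diseased, diseased) pairs
    in which the diseased case has the higher rating; ties count one half."""
    total = 0.0
    for x in non_diseased:
        for y in diseased:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(non_diseased) * len(diseased))


def jackknife_sd(non_diseased, diseased):
    """Jackknife standard deviation: remove one case at a time (from the
    pooled set of cases), recompute AUC, and scale the spread of the results."""
    k1, k2 = len(non_diseased), len(diseased)
    n = k1 + k2
    values = []
    for i in range(k1):
        values.append(empirical_auc(non_diseased[:i] + non_diseased[i + 1:], diseased))
    for j in range(k2):
        values.append(empirical_auc(non_diseased, diseased[:j] + diseased[j + 1:]))
    mean = sum(values) / n
    variance = (n - 1) / n * sum((v - mean) ** 2 for v in values)
    return variance ** 0.5


def bootstrap_sd(non_diseased, diseased, n_boot=500, seed=1):
    """Bootstrap standard deviation: resample cases with replacement,
    separately within each truth state, and recompute AUC each time."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        nd = rng.choices(non_diseased, k=len(non_diseased))
        d = rng.choices(diseased, k=len(diseased))
        values.append(empirical_auc(nd, d))
    return statistics.stdev(values)


# Assumed simulation settings (illustrative only): equal-variance binormal
# model, separation 1.5, 50 non-diseased and 50 diseased cases.
sim = random.Random(123)
z_non_diseased = [sim.gauss(0.0, 1.0) for _ in range(50)]
z_diseased = [sim.gauss(1.5, 1.0) for _ in range(50)]

auc = empirical_auc(z_non_diseased, z_diseased)
sd_jackknife = jackknife_sd(z_non_diseased, z_diseased)
sd_bootstrap = bootstrap_sd(z_non_diseased, z_diseased)
print("empirical AUC:", auc)
print("jackknife SD :", sd_jackknife)
print("bootstrap SD :", sd_bootstrap)
```

Both estimates address case sampling variability only; with a single simulated reader and a single reading per case, the between-reader and within-reader components do not enter this sketch.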