Verification Bias and Test Accuracy
Consider a standard diagnostic test, say the fasting blood glucose test for type 2 diabetes: after fasting for at least 8 hours, the patient is declared positive if the measured level exceeds 125 mg/dL. Such a patient is often given an oral glucose tolerance test, which is considered a gold standard for the disease. For those with levels between 111 and 125 mg/dL, problems with glucose metabolism are suspected, while for those with levels below 111 mg/dL, type 2 diabetes is not suspected. Patients in this last group would not ordinarily be given the oral glucose tolerance test, while for those between 111 and 125 mg/dL, a follow-up test might be appropriate if other factors point to diabetes.

In this example, the diagnostic test value Y falls into one of three categories: (1) values below 111 mg/dL, (2) values between 111 and 125 mg/dL, and (3) values of Y > 125 mg/dL. Suppose that all those in the third group undergo the gold standard for disease verification, that, say, 23% of the second group are verified, and that none of the first group are verified. This is a typical case of verification bias: the usual estimates of test accuracy are biased if they are computed using only the verified cases. For simplicity, suppose the test is declared positive if Y > 125 mg/dL and negative otherwise. If only the verified cases are used to estimate, say, sensitivity, every patient with a positive test contributes to the estimate, while only a subset of those with a negative test (Y ≤ 125 mg/dL) does. Diseased patients with negative test results, the false negatives, are therefore underrepresented among the verified cases, and the estimated sensitivity is too high compared with the value that would be obtained by referring all patients for disease verification.
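The direction of the bias can be checked with a small calculation. The following is a minimal sketch in Python, not an analysis of real data: the group sizes and the disease prevalence within each group are hypothetical values chosen only for illustration, while the verification fractions (0%, 23%, and 100%) are those of the example above. The sketch also assumes that, within a group, referral for verification does not depend on true disease status, so that expected counts can be used.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    n: int              # hypothetical number of patients in the group
    p_disease: float    # hypothetical disease prevalence within the group
    p_verified: float   # fraction referred to the gold standard (from the text)
    test_positive: bool  # True if Y > 125 mg/dL


def sensitivity(groups, verified_only):
    """Expected sensitivity, using all patients or only the verified ones."""
    true_pos = 0.0
    false_neg = 0.0
    for g in groups:
        diseased = g.n * g.p_disease
        if verified_only:
            # Only verified patients have a known disease status.
            diseased *= g.p_verified
        if g.test_positive:
            true_pos += diseased
        else:
            false_neg += diseased
    return true_pos / (true_pos + false_neg)


# Hypothetical counts and prevalences; verification fractions as in the text.
GROUPS = [
    Group("Y < 111", n=700, p_disease=0.02, p_verified=0.00, test_positive=False),
    Group("111 <= Y <= 125", n=200, p_disease=0.20, p_verified=0.23, test_positive=False),
    Group("Y > 125", n=100, p_disease=0.80, p_verified=1.00, test_positive=True),
]

naive = sensitivity(GROUPS, verified_only=True)
full = sensitivity(GROUPS, verified_only=False)
print(f"sensitivity, verified cases only: {naive:.3f}")
print(f"sensitivity, all patients verified: {full:.3f}")
```

Under these assumed inputs, the estimate based on verified cases alone is larger than the one obtained when every patient is verified, because most diseased patients with a negative test never reach the gold standard and so drop out of the denominator.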