Performance Evaluation of Image Analysis Methods | 14

ABSTRACT

With the rapid development of image analysis techniques [1-39], increasing interest has been directed toward the performance evaluation of these techniques. Commonly used evaluation criteria include accuracy, precision, efficiency, consistency, reproducibility, and robustness. To make assessments based on these criteria (e.g., accuracy), observers often compare the results obtained using these techniques with the corresponding ground truth or gold standard. The ground truth may be seen as a conceptual term relative to the knowledge of the truth concerning a specific question. The gold standard may be seen as the concrete realization of the ground truth or an accepted surrogate of truth [40]. Due to the complexity of the structures of living objects and the irregularity of the anomalies, the ground truth or the gold standard of these structures and anomalies is unknown, inaccurate, or difficult to establish. As a result, subjective criteria and procedures are often used in the performance evaluation, which can lead to inaccurate or biased assessments.

This chapter describes two approaches for the precise and quantitative evaluation of the performance of image analysis techniques. Instead of comparing the results obtained by the image analysis techniques with the ground truth or the gold standard, or using some statistical measures, these two approaches directly assess the image analysis technique itself. The first approach gives analytical assessments of the performance of each step of the image analysis technique. The second approach focuses on the validity of the image analysis technique with respect to its fundamental imaging principles.

The first approach is applied to the iFNM model-based image analysis method (Chapter 10), which consists of three steps: detection, estimation, and classification. (1) For detection performance, probabilities of over- and under-detection of the number of image regions are defined, and the corresponding formulas in terms of model parameters and image quality are derived. (2) For estimation performance, both the EM and CM algorithms are shown to produce asymptotically unbiased ML estimates of the model parameters in the no-overlap case. Cramér-Rao bounds on the variances of these estimates are derived. (3) For classification performance, a misclassification probability for the Bayesian classifier is defined, and a simple formula, based on parameter estimates and classified data, is derived to evaluate classification errors.

The results obtained by applying this method to a set of simulated images show that, for images of moderate quality (SNR > 14.2 dB, i.e., µ/σ ≥ 5.13), (1) the number of image regions suggested by the detection criterion is correct and the error-detection probabilities are almost zero; (2) the relative errors of the weight and the mean are less than 0.6%, and all parameter estimates lie within the Cramér-Rao estimation intervals; and (3) the misclassification probabilities are less than 0.5%. These results demonstrate that, for this class of image analysis methods, the detection procedure is robust, the parameter estimates are accurate, and the classification errors are small. A strength of this approach is that it not only provides the theoretically approachable accuracy limits of image analysis techniques, but also shows the practically achievable performance for the given images.

The second approach is applied to the cFNM model-based image analysis method (Chapter 11), which also consists of three steps: detection, estimation, and classification. (1) For detection performance, although the cFNM model-based image analysis method uses a sensor array eigenstructure-based approach (which differs from the information criterion-based approach used in the iFNM model-based image analysis method), the probabilities of over- and under-detection of the number of image regions are defined in a similar way. The error-detection probabilities are shown to be functions of image quality, resolution, and complexity. (2) For estimation performance, when the EM algorithm is used, the performances of the iFNM and cFNM model-based image analysis methods are similar. (3) For classification performance, the cFNM model-based image analysis method uses the MAP criterion to assess its validity with respect to the underlying imaging principles and shows that the results obtained by MAP tend toward the physical ground truth that is to be imaged.
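The two image-quality figures quoted above for the simulated images, 14.2 dB and µ/σ = 5.13, agree with each other under the amplitude convention SNR = 20 log10(µ/σ). That convention is inferred here from the quoted numbers and is not stated in the text. The short Python sketch below, with hypothetical function names, converts between the two forms.

```python
import math

def snr_db(mean, std):
    """SNR in dB under the assumed convention 20*log10(mean/std)."""
    return 20.0 * math.log10(mean / std)

def ratio_from_db(db):
    """Inverse conversion: the mean/std ratio for a given SNR in dB."""
    return 10.0 ** (db / 20.0)

# The pair quoted in the abstract, checked in both directions.
print(round(snr_db(5.13, 1.0), 1))      # SNR for mu/sigma = 5.13
print(round(ratio_from_db(14.2), 2))    # mu/sigma for 14.2 dB
```

Under a power convention (10 log10) the same ratio would give about half as many decibels, so the convention matters when comparing these thresholds with figures reported elsewhere.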

This section analyzes the iFNM model-based image analysis method of Chapter 10, evaluating its performance at each of its three steps: detection, estimation, and classification.
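As a concrete picture of what a misclassification probability measures in the classification step, the following Python sketch computes the textbook Bayes error for two Gaussian classes that share a standard deviation. It is an illustration under simplifying assumptions (two regions, equal variances, known parameters) and is not the formula derived in this chapter. The function names and the parameters `w1`, `mu1`, `w2`, `mu2`, `sigma` are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bayes_threshold(w1, mu1, w2, mu2, sigma):
    """Decision boundary between two equal-variance Gaussian classes.

    Assumes mu1 < mu2. A pixel value x is assigned to class 1 when
    w1 * N(x; mu1, sigma) > w2 * N(x; mu2, sigma), i.e. when x < threshold.
    """
    return 0.5 * (mu1 + mu2) + sigma ** 2 * math.log(w1 / w2) / (mu2 - mu1)

def misclassification_probability(w1, mu1, w2, mu2, sigma):
    """Total probability that the Bayesian classifier picks the wrong class."""
    t = bayes_threshold(w1, mu1, w2, mu2, sigma)
    err_class1 = w1 * (1.0 - normal_cdf((t - mu1) / sigma))  # class 1 above t
    err_class2 = w2 * normal_cdf((t - mu2) / sigma)          # class 2 below t
    return err_class1 + err_class2

# Equal weights, means two standard deviations apart.
print(misclassification_probability(0.5, 0.0, 0.5, 2.0, 1.0))
```

In this simplified setting the error depends only on the class weights and on the separation of the means relative to sigma, which gives some intuition for why classification errors shrink as image quality improves.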