ABSTRACT

This chapter introduces the number one statistic, the mean, and the runner-up, the correlation coefficient. Used together as discussed here, they yield the average correlation, a fruitful statistical data mining measure. The average correlation, along with the correlation coefficient, provides a quantitative criterion for assessing competing predictive models and the importance of the predictor variables. The chapter provides a SAS subroutine for calculating the average correlation. Two essential characteristics of a predictive model are its reliability and validity, terms that are often misunderstood and misused. Reliability refers to the model yielding consistent results. Validity refers to the extent to which the model measures what it is intended to measure on a given criterion. There are two other aspects of model validity: face validity and content validity. Assessing competing models is a task of the model builder, who can compare two sets of measures.