ABSTRACT

Correlational methods deal with relationships among phenomena as they exist in natural situations. A correlation can be defined in numerous ways: as the strength of association between phenomena, as the degree to which one phenomenon can be predicted from another phenomenon, or as the degree to which phenomena covary. In a sense, it is a scientific version of the kinds of natural observation in which one relates one thing to another. Underlying these observations, we generally can find some theoretical inference concerning those relationships. For example, the clinician may observe that clients with alcoholism often have fathers with histories of alcoholism. A formal study of this observation would involve obtaining information about alcoholism status in clients and their fathers. We can then tabulate these data in what is called a contingency table of the type shown in Table 6.1. It can be seen there that the alcoholic clients had alcoholic fathers far more frequently than the nonalcoholic clients. We can therefore say that there is a high correlation between alcoholism in parent and child. However, in research applications, it is necessary to know how high. The value typically used to express how high the correlation is, or the strength of association, is called the correlation coefficient. A correlation coefficient is an index of the strength of association between two variables. It is a number that can range between −1 and +1. A value of zero reflects complete absence of correlation; a −1 indicates a perfect negative correlation, whereas +1 represents a perfect positive correlation. A positive correlation occurs when both values go up together (e.g., height and weight in children), and a negative correlation occurs when one value goes up as the other goes down (e.g., days of drought and the size of a wheat crop).
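
As a concrete illustration of this kind of tabulation, the sketch below builds a 2 × 2 contingency table of the type shown in Table 6.1 from invented counts and computes a correlation coefficient for it. The counts, the use of Python with numpy/scipy, and the choice of the phi coefficient (Pearson's r applied to two binary variables) are assumptions made for this example only; the chapter does not prescribe them.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical 2x2 contingency table (the actual counts of Table 6.1 are not
# reproduced here): rows = client (alcoholic, nonalcoholic),
# columns = father (alcoholic, nonalcoholic).
table = np.array([[40, 10],    # alcoholic clients: 40 alcoholic fathers, 10 not
                  [12, 38]])   # nonalcoholic clients: 12 alcoholic fathers, 38 not

# Expand the table back into paired 0/1 observations so an ordinary Pearson
# correlation can be computed; applied to two binary variables it equals the
# phi coefficient, one common index of association for a 2x2 table.
client, father = [], []
for i, row in enumerate(table):
    for j, count in enumerate(row):
        client.extend([1 - i] * count)   # 1 = alcoholic client (row 0)
        father.extend([1 - j] * count)   # 1 = alcoholic father (column 0)

r, p = pearsonr(client, father)
print(f"correlation coefficient = {r:.2f}")
```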

Correlations are rarely perfect, and most correlation coefficients are values such as .67 or −.35. The problem then becomes one of evaluating the strength of association from these values. Statistical significance is one way; the other is to consider the amount of explained variance. Testing the statistical significance of a correlation coefficient involves determining, at a particular confidence level, whether or not the coefficient differs from zero. Typically, statistical analysis in the behavioral sciences uses the .05 level (occurrence by chance 5 times out of 100) or the .01 level (1 time out of 100). Thus, working at the .05 level, a given correlation coefficient would be significant if it could occur by chance fewer than 5 times out of 100. Nonsignificant correlations are sometimes referred to as zero-order correlations. The statistical significance of a single correlation coefficient is, however, a relatively trivial matter in most research: particularly when the purpose of the research is to predict an unknown variable from a known variable, statistically significant correlation coefficients can have exceedingly low predictive value. Generally, a more important consideration is the percentage of explained variance, which is the squared correlation coefficient. Thus, a correlation coefficient of .40 yields 16% explained variance; that is, 16% of the variance in the unknown variable can be accounted for by variance in the known variable, and the remaining variance has to be accounted for by other factors. Sometimes statistically significant correlations explain very little variance.
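
The contrast between these two criteria can be sketched numerically. In the example below the data are simulated purely for illustration, and Python with numpy/scipy is assumed: a weak but statistically significant correlation in a large sample is compared with the r = .40 worked figure from the text.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Simulate a large sample in which the two variables are only weakly related.
n = 1000
known = rng.normal(size=n)
unknown = 0.15 * known + rng.normal(size=n)   # weak relationship plus noise

r, p = pearsonr(known, unknown)
explained = r ** 2   # proportion of variance in "unknown" accounted for by "known"
print(f"r = {r:.2f}, p = {p:.4f}, explained variance = {explained:.1%}")
# With a sample this large the correlation is typically significant at the
# .05 level even though it explains only a few percent of the variance.

# The worked figure from the text: a coefficient of .40 squared gives 16%.
print(f"r = 0.40 -> explained variance = {0.40 ** 2:.0%}")
```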