1900 to 1930 | 7 | The History of Correlation

ABSTRACT

This chapter traces the slow acceptance of the concept of mathematical correlation and of the linear regression coefficient, as evidenced by published papers and textbooks. After that slow start, the number of new publications multiplied rapidly, and departments of mathematical statistics began forming in universities worldwide. Unfortunately, the authors of those papers and books did not all agree on what correlation is, what value it has, or how to calculate or estimate it. As a result of these disagreements, the factions involved published many acrimonious attacks on one another. Much of the disagreement can be traced to whether the authors were discussing controlled or uncontrolled studies. Virtually all the statistics textbooks published at the start of these decades were written by non-mathematicians (e.g., economists and biologists); as the years passed, more and more textbooks were written by mathematicians, and the explanations and definitions of correlation they provided reflected that change. In the mid-1920s, several people worked independently on designing machines that could calculate correlation coefficients for very large data sets. The “coefficient of determination” was published as a more valid measure of the strength of correlation. The unfortunate practice of using the correlation coefficient to screen study results was also promoted; this chapter discusses why that practice is unfortunate.