ABSTRACT

Karl Pearson described spurious correlation in 1897. If the values of three variables, x, y, and z, are drawn independently at random from a distribution, then on average we expect no correlation of x with y, x with z, or y with z. But if we divide the values of both x and y by z, the ratios x/z and y/z will be correlated, because they share a common denominator. Pearson called this a “spurious” relationship. Since incidence rates are event counts divided by person-time, might spurious correlation arise if the independent (explanatory) variables in an analysis are also divided by person-time? Jerzy Neyman discussed this topic in 1947, using hypothetical data suggesting that storks bring babies. Questions about Neyman’s example are raised to motivate readers.
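
Pearson's observation can be checked with a small simulation. The Python sketch below is illustrative only and is not taken from Pearson or Neyman: the function names (pearson_r, simulate), the uniform distribution on [50, 150], the sample size, and the seed are all assumptions chosen for convenience. It draws x, y, and z independently, then compares the correlation of x with y against the correlation of x/z with y/z.

```python
import math
import random


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=20000, seed=1897):
    """Return (r of x with y, r of x/z with y/z) for independent draws.

    Assumption: values are uniform on [50, 150], so z is never close to
    zero and the ratios stay well behaved. Any positive distribution
    would serve for the illustration.
    """
    rng = random.Random(seed)
    x = [rng.uniform(50, 150) for _ in range(n)]
    y = [rng.uniform(50, 150) for _ in range(n)]
    z = [rng.uniform(50, 150) for _ in range(n)]

    # Divide both x and y by the same z, as in Pearson's example.
    ratio_x = [xi / zi for xi, zi in zip(x, z)]
    ratio_y = [yi / zi for yi, zi in zip(y, z)]

    return pearson_r(x, y), pearson_r(ratio_x, ratio_y)


if __name__ == "__main__":
    r_raw, r_ratio = simulate()
    print(f"correlation of x with y:     {r_raw:.3f}")
    print(f"correlation of x/z with y/z: {r_ratio:.3f}")
```

Under these assumptions the first correlation should be close to zero and the second clearly positive, even though x and y were generated without any relationship to each other. The size of the second correlation depends on how variable z is relative to x and y, so a different choice of distributions would give a different value.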