Causal Inference and Extraneous Factors: Confounding and Interaction

ABSTRACT

This is the time to temper the exuberance of the last few chapters and catch our breath for a moment. We seem to have solved many problems: we have learned how to sample populations to collect useful information about the relationship between a risk factor and a disease outcome using a variety of measures of association; we have discussed how to assess whether an observed association may be simply due to chance and, of more importance, discussed methods to quantify both the association and the uncertainty surrounding its estimation. But what do these procedures really tell us about how exposure influences the risk of disease in practice? I am reminded here of the scene in The Sound of Music when the children, having learned to sing by merely using the names of the notes, complain that the song, however beautiful, doesn’t mean anything! Notice that the statistical techniques we have introduced pay no attention to the nature of the exposure and disease variables. At the extreme, the methods do not even acknowledge whether E occurs before D in chronological time or not. If we define smoking to be our outcome D and having a lung tumor to be our exposure E, are we possibly establishing that lung cancer is a risk factor for smoking? In Section 7.1.3, why is the possibility that diagnosis with pancreatic cancer leads to increased coffee consumption not an equally plausible interpretation of our calculations? The point is that the statistical tools of Chapter 7 do not, in and of themselves, differentiate between the possibility that E causes D or D causes E.
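The symmetry described above can be made concrete with a small sketch. It takes the odds ratio as one example of a measure of association and shows that the value is unchanged when the labels E and D are swapped. The 2x2 counts are hypothetical, chosen only for illustration, and the function name `odds_ratio` is an assumption of this sketch rather than anything defined in the text.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table with rows (a, b) and (c, d)."""
    return (a * d) / (b * c)


# Hypothetical counts (illustration only, not data from the text):
#            D     not D
# E          40     60
# not E      20     80
a, b, c, d = 40, 60, 20, 80

# Treat E as the exposure and D as the outcome.
or_e_then_d = odds_ratio(a, b, c, d)

# Swap the roles: D becomes the "exposure" and E the "outcome".
# This transposes the table, so the rows are now (a, c) and (b, d).
or_d_then_e = odds_ratio(a, c, b, d)

print(or_e_then_d, or_d_then_e)
```

Both calls return the same number, because transposing the table leaves the cross-product ratio ad/(bc) unchanged. The calculation alone therefore cannot say which variable came first, let alone which one causes the other.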