Validation and Confirmation of Associations

ABSTRACT

The advent of molecular epidemiology has resulted in a flurry of postulated associations.
The discovery of associations is continuously facilitated by the arrival of more massive
and efficient platforms for measuring biological factors of interest. At the same time, this
has created an untamed plethora of postulated risk factors, only a fraction of which may
be true (sufficiently “credible” in a Bayesian framework). A survey of the published
literature shows that almost all epidemiological papers claim at least one finding to which
they attribute statistical significance. An empirical evaluation (1) showed that 87% of
epidemiological studies published in 2005 claimed at least one statistically significant
result in their abstracts. For some fields in molecular epidemiology, the situation is even
more extreme (2). For example, in an empirical survey of 340 cancer prognostic factor
studies that were included in meta-analyses and another 1575 articles on cancer
prognostic factors published in 2005, the proportion of articles that claimed
statistically significant prognostic effects in their abstracts was 90.6% and 95.8%,
respectively. Even among the few studies that did not claim statistically significant
prognostic effects, the majority either claimed statistically significant results for
something else or significant effects based on trends, or at least offered some “apologies”
that supported the probed associations based on some other external, qualitative, or
subjective evidence. Fully “negative” articles amounted to only 1.5% and 1.3% of the
articles in the two data sets, respectively.