ABSTRACT

The frequentist approach to statistical inference forbids respecifying the empirical model in order to achieve specific outcomes such as statistical significance; otherwise, the p-values of the applied tests become invalid. Nonetheless, many researchers use the degrees of freedom available to them to search for statistical significance, since insignificant findings are difficult to publish. The chapter surveys the different manifestations of this problem, such as p-hacking, forking, and fishing, and ranks them according to their consequences for the literature. It further lays out how researchers p-hack in practice and explains, in a separate section, why p-hacking is so problematic through the logic of the multiple comparisons problem. The chapter also addresses HARKing, which arises when researchers have flexibility in posing research questions, allowing them to reverse-engineer their theoretical arguments to fit the empirical patterns they have detected in their data. The chapter closes by arguing that we may have to abandon the dichotomous designation of findings as either statistically significant or insignificant and instead focus on the overall quality of the study itself, irrespective of its p-values.
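The multiple comparisons logic invoked above can be made concrete with a short simulation. This sketch is not part of the chapter; it simply illustrates the standard result that a researcher who tests many true null hypotheses (here, a hypothetical 20 independent tests at the 0.05 level) will obtain at least one spuriously significant p-value with probability 1 − 0.95^20 ≈ 0.64.

```python
import random

random.seed(42)

ALPHA = 0.05
N_TESTS = 20          # hypothetical number of specifications a researcher tries
N_SIMULATIONS = 10_000

# Analytical probability of at least one false positive among
# N_TESTS independent tests when every null hypothesis is true.
analytical = 1 - (1 - ALPHA) ** N_TESTS

# Monte Carlo check: under a true null, a valid p-value is uniform
# on [0, 1]. Count simulation runs with at least one p < ALPHA.
hits = 0
for _ in range(N_SIMULATIONS):
    p_values = [random.random() for _ in range(N_TESTS)]
    if any(p < ALPHA for p in p_values):
        hits += 1
empirical = hits / N_SIMULATIONS

print(f"analytical: {analytical:.3f}")
print(f"empirical:  {empirical:.3f}")
```

Both numbers come out near 0.64, which is why unreported specification searches invalidate the nominal 5% error rate.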