ABSTRACT

A brief survey of the literature suggests that there is anything but a consensus on issues concerning the foundations of data science. The fundamental distinction between enumerative and eliminative induction is briefly introduced: the former focuses on the mere repetition of phenomena, the latter on the variation of phenomena. The chapter argues that eliminative induction provides a much more plausible and realistic picture of actual scientific practice. Moreover, it outlines an account of causation that corresponds to eliminative induction and that makes it possible to establish the crucial distinction between relationships that are purely accidental and those that allow for prediction and manipulation. The methodological framework sketched in section "Causation" relies on the assumption of determinism, which certainly cannot be upheld for most applications of data science. The chapter therefore also sketches an objective, nonfrequency interpretation of probability that relies on symmetries in the causal structure of probabilistic phenomena to establish probability values.