ABSTRACT

Data and code are becoming as important to research dissemination as the traditional manuscript. For computational science, the evidence is clear: it is typically impossible to verify scientific claims without access to the code and data that generated published findings. Gentleman and Lang [1] introduced the notion of the “research compendium” as the unit of scholarly communication, a triple including the explanatory narrative, the code, and the data used in deriving the results. One of the reasons for including the code and data is to facilitate the production of “really reproducible research,” a phrase coined by Jon Claerbout in 1991 to mean research results that can be regenerated from the available code and data. Claerbout’s approach was paraphrased by Donoho and Buckheit [2] as follows:

The idea is: An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.