ABSTRACT

Scientific discovery in the natural sciences is increasingly data-driven and computationally intensive, which opens up unprecedented opportunities for data analysis and scientific simulation. To accelerate scientific discovery through advanced computing and information technology, various research programs have been launched in recent years, for example, the SciDAC program of the Department of Energy [1] and the Cyberinfrastructure initiative of the National Science Foundation [2], both in the United States. In the UK, the term e-Science [3] was coined to describe computationally intensive and data-intensive science, and a large e-Science research program was started there in 2000. These new opportunities also bring new challenges for scientists, for example, managing the enormous amounts of data generated [4] and coping with the increasingly sophisticated but also more complex computing environments provided by cluster computers and distributed grid environments. Scientific workflows aim to address many of these challenges.