Innovations in science and engineering are increasingly driven by making sense of massive datasets. Advanced simulations and experimental analyses in disciplines such as high-energy physics, climate modeling, astronomy, and the life sciences routinely require processing terabytes or even petabytes of data. Accordingly, data-intensive scientific discovery has been identified as the fourth paradigm of science, joining the three traditional paradigms: experimental science, theoretical science, and computational science [HTT09].