ABSTRACT

Scott Klasky,1 Hasan Abbasi,2 Viraj Bhat,3 Ciprian Docan,3 Steve Hodson,1 Chen Jin,1 Jay Lofstead,2 Manish Parashar,3 Karsten Schwan,2 and Matthew Wolf2

1 Oak Ridge National Laboratory
2 Georgia Institute of Technology
3 Rutgers, The State University of New Jersey

Contents

5.1 Introduction
5.2 High-Performance Data Capture
    5.2.1 Asynchronous Capture of Typed Data
    5.2.2 DataTaps and DataTap Servers
    5.2.3 High-Speed Asynchronous Data Extraction Using DART
    5.2.4 In-Transit Services
        5.2.4.1 Structured Data Transport: EVPath
        5.2.4.2 Data Workspaces and Augmentation of Storage Services
        5.2.4.3 Autonomic Data Movement Services Using IQ-Paths
5.3 Autonomic Services for Wide-Area and In-Transit Data
    5.3.1 An Infrastructure for Autonomic Data Streaming
    5.3.2 QoS Management at In-Transit Nodes
5.4 Conclusions
References

In this chapter, we look at technology changes affecting scientists who run data-intensive simulations, particularly the ways in which these computations are run and how the data they produce is analyzed. As computer systems and technology evolve, and as supercomputer usage policies often permit very long runs, simulations are starting to run for more than 24 hours and to produce unprecedented amounts of data. Previously, data produced by supercomputer applications was simply stored as files for subsequent analysis, sometimes days or weeks later. However, once the volume of data becomes very large, or the rates at which supercomputers produce or consume data become very high, this approach no longer works, and high-throughput data movement techniques are needed.