ABSTRACT

May Not Be Appropriate
    7.5.7 Conclusions and Future Work
7.6 Extremely Large Databases and SciDB
    7.6.1 Differences between the Requirements of Scientific and Commercial Databases
    7.6.2 The Array Data Model in SciDB
        7.6.2.1 Definition and Creation
        7.6.2.2 Operators
    7.6.3 Data Overwrite and Provenance
    7.6.4 Uncertainty
    7.6.5 Storage Layout
Acknowledgments
References

Consider a high-energy physics experiment, where elementary particles are accelerated to nearly the speed of light and made to collide. These collisions generate a large number of additional particles. For each collision, called an event, about 1 to 10 MB of raw data are collected. Collisions occur at a rate of about 10 per second, which corresponds to hundreds of millions or a few billion events per year. Such events are also generated by large-scale simulations. After the raw data are collected, they undergo a reconstruction phase, in which each event is analyzed to determine the particles it produced and to extract hundreds of summary properties (such as the total energy of the event, its momentum, and the number of particles of each type).
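To make the scale concrete, the arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: the function name yearly_totals is hypothetical, and the calculation assumes the experiment records events continuously all year, which a real detector would not do.

```python
# Back-of-the-envelope estimate of yearly event counts and raw data volume.
# Assumption (not from the text): data taking runs continuously, all year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def yearly_totals(events_per_second, mb_per_event):
    """Return (events per year, raw data in TB per year), decimal units."""
    events = events_per_second * SECONDS_PER_YEAR
    terabytes = events * mb_per_event / 1_000_000  # 1 TB = 10^6 MB
    return events, terabytes


# The text gives about 10 events per second and 1 to 10 MB per event.
for size_mb in (1, 10):
    events, tb = yearly_totals(10, size_mb)
    print(f"{size_mb} MB/event: {events:,} events/year, about {tb:,.0f} TB/year")
```

Under these assumptions, 10 events per second gives roughly 315 million events per year, and the raw data alone amount to somewhere between about 0.3 and 3 petabytes per year, before any reconstructed summary properties are stored.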