ABSTRACT

After a decade in which high-end computing was dominated by rapid improvements in CPU frequency, the performance of next-generation supercomputers is increasingly differentiated by varying interconnect designs and levels of integration. Understanding the tradeoffs of these system designs is a key step towards making effective petascale computing a reality. In this work, we conduct an extensive performance evaluation of five key scientific application areas (plasma micro-turbulence, quantum chromodynamics, micro-finite-element solid mechanics, supernovae, and general relativistic astrophysics) that use a variety of advanced computational methods, including adaptive mesh refinement, lattice topologies, particle-in-cell, and unstructured finite elements. Scalability results and analysis are presented on three current high-end HPC systems: the IBM Blue Gene/P at Argonne National Laboratory, the Cray XT4 at Berkeley Laboratory’s NERSC Center, and an Intel Xeon cluster at Lawrence Livermore National Laboratory.¹ In this chapter, we present each code in its own section, where we describe the application, the parallelization strategies, and the primary results on each of the three platforms. We then follow with a collective analysis of the codes’ performance and make concluding remarks.