ABSTRACT

Department of Computer Science, University of Illinois at Urbana-Champaign

Filippo Gioachin

Hewlett-Packard Laboratories Singapore

6.1 Introduction
6.2 Code Design
    6.2.1 Domain Decomposition and Load Balancing
    6.2.2 Tree Building
    6.2.3 Tree Walking
    6.2.4 Force Softening
    6.2.5 Periodic Boundary Conditions
    6.2.6 Neighbor Finding
    6.2.7 Multi-Stepping
6.3 Accuracy Tests
    6.3.1 Force Errors
    6.3.2 Cosmology Tests
6.4 Performance
    6.4.1 Domain Decomposition and Tree Build Performance
    6.4.2 Single-Stepping Performance
    6.4.3 Multi-Stepping Performance
    6.4.4 ChaNGa on GPUs
        6.4.4.1 Adaptations for GPU Architectures
        6.4.4.2 Performance
6.5 Conclusions and Future Work
Acknowledgments

Parallel Approach

It is remarkable that cosmology is now a “precision” science. The age of the Universe is known to about a percent; the relative amounts of baryons, dark matter, and dark energy are known to a few percent; the expansion rate is also known to a few percent. This is in spite of the fact that we know very little about the nature of the main constituents of the Universe: dark matter and dark energy. What we do know is that they gravitate, and gravity is by far the dominant force on astronomical scales. Hence, building models of large-scale structure and making testable predictions is relatively straightforward: the matter is represented by a number of particles, and the motion of these particles is followed under their mutual gravitational forces, a technique referred to as an N-body simulation. Notable successes in cosmological model building include predicting that the dark matter around disk galaxies is more spherically distributed than the light [193], and ruling out light neutrinos as a major component of the dark matter [245].
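To make the N-body idea concrete, the following is a minimal sketch of the technique in its simplest form: direct pairwise summation of Newtonian gravity with a kick-drift-kick leapfrog integrator. It is an illustration only and not the method used by the code described in this chapter, which relies on a tree algorithm. The function names, the unit choice G = 1, and the Plummer-style softening parameter `eps` are all assumptions introduced for this example.

```python
import math

def accelerations(pos, mass, G=1.0, eps=0.0):
    """Direct O(N^2) sum of pairwise gravitational accelerations.

    pos  : list of [x, y, z] positions
    mass : list of particle masses
    eps  : hypothetical Plummer-style softening length (assumption)
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector pointing from particle i to particle j.
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + eps ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                # Equal and opposite forces: i is pulled toward j, j toward i.
                acc[i][k] += G * mass[j] * d[k] * inv_r3
                acc[j][k] -= G * mass[i] * d[k] * inv_r3
    return acc

def leapfrog_step(pos, vel, mass, dt, G=1.0, eps=0.0):
    """Advance positions and velocities in place by one kick-drift-kick step."""
    acc = accelerations(pos, mass, G, eps)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]   # half kick
            pos[i][k] += dt * vel[i][k]         # full drift
    acc = accelerations(pos, mass, G, eps)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]   # second half kick

if __name__ == "__main__":
    # Two equal masses on a circular orbit about their common center.
    v = math.sqrt(0.5)
    pos = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
    vel = [[0.0, -v, 0.0], [0.0, v, 0.0]]
    mass = [1.0, 1.0]
    for _ in range(100):
        leapfrog_step(pos, vel, mass, dt=0.01)
    print(pos)
```

The cost of the double loop grows as the square of the particle count, which is why production codes replace it with approximate methods such as tree walks once the number of particles becomes large.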