ABSTRACT

This chapter focuses on the Cray Test and Development System running real-world high-performance computing (HPC) applications in realistic settings. The accelerated Hierarchical Equation of Motion (HEOM) variant developed at Zuse Institute Berlin (ZIB) distributes the auxiliary density operators (ADOs) across compute nodes in addition to parallelizing the work on each compute device. Many HPC application codes implement generalized solutions for their respective domains and typically provide a selection of algorithms suited to different workloads or combinations of input sets. Scaling offload applications beyond the limited compute performance and device memory provided by the accelerators within a single node requires the use of remote devices over fabric. Many-core computer architectures exist because of the need to improve the energy efficiency of floating-point computations on the way to exascale machines within a feasible power budget.