Chapter 8
Transitioning NWChem to the Next Generation of Manycore Machines
By Eric J. Bylaska, Edoardo Aprà, Karol Kowalski, Mathias Jacquelin, Wibe A. de Jong, Abhinav Vishnu, Bruce Palmer, Jeff Daily, Tjerk P. Straatsma, Jeff R. Hammond, Michael Klemm
Pages 22

This chapter presents efforts to add thread-level parallelism to three core modules in NWChem: plane-wave density functional theory, the tensor contraction engine, and Car-Parrinello molecular dynamics (MD). The global array (GA)/message passing interface (MPI) model can be used on many of today's architectures with large numbers of cores, for example the second-generation Intel Xeon Phi hardware named Knights Landing. Originally designed and implemented for single-processor computers, the code has needed refactoring to take advantage of vector processors, then parallel clusters, and now massively parallel hybrid systems with many-core and massively threaded accelerators. All implementations of MD on massively parallel machines use domain decomposition to distribute the atomic data over the available processes. The basic MD process consists of evaluating the Newtonian forces on the atoms and using these forces, together with the atomic velocities, to advance the atomic coordinates. The GA parallel library is undergoing a major rewrite to make it completely thread safe.
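The basic MD process and the domain decomposition mentioned above can be illustrated with a minimal Python sketch. It is not taken from NWChem: the function names `assign_domains` and `md_step`, the cubic box split into equal cells, and the simple update rule (velocities first, then coordinates) are all assumptions chosen for illustration, and a production code would use a more careful integrator and exchange atoms between processes.

```python
from typing import Callable, List, Sequence, Tuple

Vec = Tuple[float, float, float]


def assign_domains(coords: Sequence[Vec], box_length: float,
                   cells_per_side: int) -> List[int]:
    """Return, for each atom, the index of the spatial cell that owns it.

    Hypothetical layout: a cubic box of side box_length is cut into
    cells_per_side**3 equal cells, and each cell maps to one process.
    """
    cell = box_length / cells_per_side
    owners = []
    for pos in coords:
        idx = []
        for comp in pos:
            # Wrap into the box, then clamp to guard against rounding.
            i = int((comp % box_length) / cell)
            idx.append(min(i, cells_per_side - 1))
        owners.append((idx[0] * cells_per_side + idx[1]) * cells_per_side + idx[2])
    return owners


def md_step(coords: Sequence[Vec], velocities: Sequence[Vec],
            masses: Sequence[float],
            force_fn: Callable[[Sequence[Vec]], Sequence[Vec]],
            dt: float) -> Tuple[List[Vec], List[Vec]]:
    """Advance one time step: evaluate forces, update velocities, then coordinates."""
    forces = force_fn(coords)
    new_v = [tuple(v[k] + dt * f[k] / m for k in range(3))
             for v, f, m in zip(velocities, forces, masses)]
    new_x = [tuple(x[k] + dt * v[k] for k in range(3))
             for x, v in zip(coords, new_v)]
    return new_x, new_v


def harmonic_force(coords: Sequence[Vec]) -> List[Vec]:
    """Toy force for the example: each atom is tied to the origin by a unit spring."""
    return [tuple(-c for c in pos) for pos in coords]
```

In a parallel run under these assumptions, each process would call `md_step` only for the atoms that `assign_domains` maps to its cell, with forces from neighbouring cells obtained through communication.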