ABSTRACT

The need to employ appropriate affinity mechanisms has been stressed many times in this book: cache size considerations, bandwidth bottlenecks, OpenMP parallelization overhead, ccNUMA locality, MPI intranode communication, and the performance of MPI/OpenMP hybrid codes are all influenced by how threads and processes are bound to the cores of a shared-memory system. In general, there are three aspects to consider when trying to “do it all right.”