ABSTRACT

When an application runs across a massively parallel processor (MPP), numerous inefficiencies can arise that may be caused by the structure of the application, interference with other applications running on the MPP system, or inefficiencies of the hardware and software that make up the MPP. The interference due to other applications can be a source of nonreproducibility of runtimes. For example, all applications run best when they run on an MPP in a dedicated mode. As soon as a second application is introduced, message passing and/or I/O from the second application could perturb the performance of the first application by competing for the interconnect bandwidth. As an application is scaled to larger and larger processor counts, it becomes more sensitive to this potential interference. The application programmer should understand these issues and know what steps can be taken to minimize the impact of the inefficiencies. The issues we cover in this chapter are

1. Topology of the interconnect and how knowledge of it can be used to minimize interference from other applications

2. Interconnect characteristics and how they might impact runtimes

3. Operating system jitter

2.1 TOPOLOGY OF THE INTERCONNECT

The topology of the interconnect dictates how the MPP nodes are connected together. There are numerous interconnect topologies in use today. As MPP systems grow to larger and larger node counts, the cost of some interconnect topologies grows faster than others. For example, a complete crossbar switch that connects every processor to every other processor becomes prohibitively expensive as the number of processors increases. On the other hand, a two- or three-dimensional torus grows linearly with the number of nodes. All very large MPP systems such as IBM®'s Blue Gene® [5] and Cray's XT™ [6] use a 3D torus. In this discussion, we will concentrate on the torus topology.
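To make the scaling argument concrete, the following small C sketch (not from the text) tabulates bidirectional link counts under the simple assumptions that a full crossbar requires N(N-1)/2 links (every pair of nodes directly connected) and a 3D torus requires 3N links (six links per node, each shared by two nodes). The node counts chosen are illustrative only.

/* Illustrative sketch: link-count growth of a full crossbar versus a 3D torus.
 * Assumes crossbar = N*(N-1)/2 links, 3D torus = 3*N links. */
#include <stdio.h>

int main(void)
{
    long long nodes[] = {64, 512, 4096, 32768};
    int count = sizeof(nodes) / sizeof(nodes[0]);

    printf("%10s %16s %12s\n", "nodes", "crossbar links", "torus links");
    for (int i = 0; i < count; i++) {
        long long n = nodes[i];
        long long crossbar = n * (n - 1) / 2;  /* grows quadratically with N */
        long long torus = 3 * n;               /* grows linearly with N */
        printf("%10lld %16lld %12lld\n", n, crossbar, torus);
    }
    return 0;
}

Running this shows the crossbar link count exploding into the hundreds of millions at 32,768 nodes while the torus stays below 100,000, which is why the torus is the practical choice for very large MPP systems.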