ABSTRACT

Distributed memory machines (DMMs) offer great promise because of their scalability and their potential for enormous computational power. Yet their widespread use has been hindered by the difficulty of parallel programming: scientific programmers have had to write explicitly parallel code and to confront many efficiency issues in order to obtain satisfactory performance. When a parallel program is run on a DMM, the data must be distributed among the processors, each of which has its own local address space, and explicit communication must be provided for the exchange of data among processors that is inherent in many programs. The processors in a DMM communicate by exchanging messages whenever a processor needs data that reside in some other processor's local memory. This software-mediated exchange of data is commonly referred to in the literature as message passing, and DMMs are accordingly known as message-passing computers. On these machines, local memory accesses are much faster than accesses that involve interprocessor communication. Deciding where to insert messages in a program, and thereby how to partition the data and the computation, and which partitioning of the data is optimal, are not easy tasks; much effort has therefore gone into developing ways to relieve the programmer of this burden.
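
To make the explicit communication described above concrete, the following minimal sketch uses MPI as a representative message-passing layer (MPI is an illustrative assumption here, not a system named in this abstract). Processor 1 holds a datum in its local memory, and processor 0 must obtain it through an explicit send/receive pair rather than through a direct memory access, which is exactly the burden that automatic data and computation partitioning seeks to lift from the programmer.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each processor owns a datum in its own local address space. */
        double local = (double) rank;

        if (rank == 0 && size > 1) {
            double remote;
            /* Processor 0 needs a value residing in processor 1's local
               memory, so it must receive an explicit message. */
            MPI_Recv(&remote, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 0 received %f from rank 1\n", remote);
        } else if (rank == 1) {
            /* Processor 1 must explicitly send its local datum; no
               processor can read another's memory directly. */
            MPI_Send(&local, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

A typical MPI installation would build and run this with, for example, mpicc followed by mpirun -np 2; the point of the sketch is only that both the data placement and the communication are written out by hand.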