ABSTRACT

The subject of this chapter is the design and analysis of parallel algorithms. Most of today's algorithms are sequential; that is, they specify a sequence of steps in which each step consists of a single operation. These algorithms are well suited to today's computers, which basically perform operations in a sequential fashion. Although the speed at which sequential computers operate has been improving at an exponential rate for many years, the improvement is now coming at greater and greater cost. As a consequence, researchers have sought more cost-effective improvements by building "parallel" computers, which perform multiple operations in a single step. In order to solve a problem efficiently on a parallel computer, it is usually necessary to design an algorithm that specifies multiple operations on each step, i.e., a parallel algorithm.

As an example, consider the problem of computing the sum of a sequence A of n numbers. The standard algorithm computes the sum by making a single pass through the sequence, keeping a running sum of the numbers seen so far. It is not difficult, however, to devise an algorithm for computing the sum that performs many operations in parallel. For example, suppose that, in parallel, each element of A with an even index is paired and summed with the next element of A, which has an odd index, i.e., A[0] is paired with A[1], A[2] with A[3], and so on. The result is a new sequence of n/2 numbers that sum to the same value as the sum that we wish to compute. This pairing-and-summing step can be repeated until, after log2 n steps, a sequence consisting of a single value is produced, and this value is equal to the final sum.
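Below is a minimal sketch of this pairwise summation, written in Python for concreteness; the chapter prescribes no particular language, and the function name pairwise_sum, as well as the choice to pad odd-length sequences with zero, are our own. The rounds run one after another here, but every addition within a round is independent of the others, which is exactly the parallelism a parallel machine could exploit.

    def pairwise_sum(a):
        """Sum a sequence by repeated pairing, as described above.

        Returns the total and the number of rounds (log2 n when n is a
        power of two). The code is sequential, but each round's
        additions are mutually independent and could all be performed
        in a single parallel step.
        """
        a = list(a)
        rounds = 0
        while len(a) > 1:
            if len(a) % 2 == 1:
                a.append(0)        # pad with the additive identity
            # A[0]+A[1], A[2]+A[3], ...: no addition depends on another
            a = [a[i] + a[i + 1] for i in range(0, len(a), 2)]
            rounds += 1
        return a[0], rounds

    print(pairwise_sum(range(1, 9)))   # (36, 3): 8 numbers, log2(8) = 3 rounds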

The parallelism in an algorithm can yield improved performance on many different kinds of computers. For example, on a parallel computer, the operations in a parallel algorithm can be performed simultaneously by different processors. Furthermore, even on a single-processor computer the parallelism in an algorithm can be exploited by using multiple functional units, pipelined functional units, or pipelined memory systems. Thus, it is important to make a distinction between the parallelism in an algorithm and the ability of any particular computer to perform multiple operations in parallel. Of course, in order for a parallel algorithm to run efficiently on any type of computer, the algorithm must contain at least as much parallelism as the computer, for otherwise resources would be left idle. Unfortunately, the converse does not always hold: some parallel computers cannot efficiently execute all algorithms, even if the algorithms contain a great deal of parallelism. Experience has shown that it is more difficult to build a general-purpose parallel computer than a general-purpose sequential computer.
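As a concrete illustration of that utilization point (our own construction, not the chapter's), the following snippet counts how many of p processors each round of the pairwise summation can keep busy; once a round offers fewer than p additions, the remaining processors necessarily sit idle.

    def round_utilization(n, p):
        """Report, for each round of the pairwise sum on n numbers, how
        many of p processors have work. Illustrative only; n and p are
        hypothetical parameters, and n is assumed to be a power of two."""
        ops = n // 2                   # additions available in round 1
        while ops >= 1:
            busy = min(ops, p)
            print(f"{ops} additions available: {busy} busy, {p - busy} idle")
            ops //= 2                  # each round halves the work

    round_utilization(n=16, p=8)
    # 8 additions available: 8 busy, 0 idle
    # 4 additions available: 4 busy, 4 idle
    # 2 additions available: 2 busy, 6 idle
    # 1 additions available: 1 busy, 7 idle

Only the first round keeps all eight processors busy; conversely, if a round offered more than p additions, a real machine would have to serialize some of them.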

The remainder of this chapter consists of nine sections. We begin in Section 25.2 with a discussion of how to model parallel computers. Next, in Section 25.3 we cover some general techniques that have proven useful in the design of parallel algorithms. Sections 25.4 through 25.8 present algorithms for solving problems from different domains. We conclude in Section 25.9 with a discussion of current research topics, a collection of defining terms, and finally sources for further information.

Throughout this chapter, we assume that the reader has some familiarity with sequential algorithms and asymptotic notation and analysis.