ABSTRACT

Multicore chips are about to dramatically change software development. They are already everywhere; in fact, it is difficult to find a PC with a single main processor. As of this writing, laptops come equipped with two to eight cores. Even smartphones and tablets contain multicore chips. Intel produces chips with 48 cores, Tilera with 100, and Nvidia’s graphics processor chips provide several hundred execution units. For major chip manufacturers, multicore has already passed single core in terms of volume shipment. The question for software developers is what to do with this embarrassment of riches.

Ignoring multicore is not an option. One of the reasons is that single-processor performance is going to increase only marginally in the future; it might even decrease in order to lower energy consumption. Thus, the habit of waiting for the next processor generation to increase application performance no longer works. Future increases in computing power will come from parallelism, and software developers need to embrace parallel programming rather than resist it.

Why did this happen? The current sea change from sequential to parallel processing is driven by the confluence of three events. The first event is the end of exponential growth in single-processor performance. This event is caused by our inability to increase clock frequencies without increasing power dissipation. In the past, higher clock speeds could be compensated for by lower supply voltages. Since this is no longer possible, increasing clock speeds would push power consumption past the few hundred watts per chip that can practically be dissipated in mass-market computers, and past the power available in battery-operated mobile devices.

The second event is that parallelism internal to the architecture of a processor has reached a point of diminishing returns. Deeper pipelines, instruction-level parallelism, and speculative execution appear to offer no opportunity to significantly improve performance.

The third event is really a continuing trend: Moore’s law, which projects exponential growth in the number of transistors per chip, continues to hold. The 2009 International Technology Roadmap for Semiconductors (https://www.itrs.net/Links/2009ITRS/Home2009.htm) expects this growth to continue for another 10 years; beyond that, fundamental limits of CMOS scaling may slow growth.

The net result is that hardware designers are using the additional transistors to provide additional cores, while keeping clock rates constant. Some of the extra processors may even be specialized, for example, for encryption, video processing, or graphics. Specialized processors are advantageous in that they provide more performance per watt than general-purpose CPUs. Programmers will have to deal not only with parallelism, but also with heterogeneous instruction sets on a single chip.