ABSTRACT

Thanks to the increasing transistor density in integrated circuits described by Moore’s Law, graphics processing units (GPUs), the Cell Broadband Engine (CBE), and traditional microprocessors with increasingly high levels of core-level parallelism are now state-of-the-art commodity components. Their computational throughput, flexibility, and power efficiency are such that designing systems with these components is the best way to meet the scientific and biomedical community’s ever-increasing need for processing cycles. Indeed, Los Alamos National Laboratory’s Cell-accelerated Roadrunner cluster was the first supercomputer to break the petaflops barrier, and it held the title of the fastest supercomputer in the world until just recently, when it lost that title to Oak Ridge National Laboratory’s Jaguar cluster.