ABSTRACT

Achieving extreme-scale computing requires highly power-efficient computing elements. Power efficiency is usually achieved by cutting the transistor budget of hardware structures that exploit data locality, such as caches, and replacing them with software-managed local stores to maintain performance. It can also require removing hardware structures that exploit instruction-level parallelism, such as out-of-order execution units, and relying instead on vector execution units. These power-efficiency measures generally complicate software development. Heterogeneous systems offer a trade-off: they combine complex processor cores with power-efficient accelerators to handle multiple code types.