ABSTRACT

This chapter discusses hardware support for implementing synchronization primitives. From the software point of view, the widely used synchronization primitives are locks, barriers, and point-to-point synchronization such as signal-wait pairs. Hardware can support these primitives in many ways. A common practice today is to implement only the lowest-level primitive in hardware, in the form of atomic instructions, and to build all other synchronization primitives on top of it in software. The chapter also discusses the trade-offs among various implementations of locks and barriers, and shows that achieving synchronization that is both fast and scalable is not trivial: there is often a trade-off between scalability and uncontended latency. The chapter describes several software barrier implementations. For very large systems, software barriers built on top of atomic instructions and locks may not scale sufficiently.
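
As an illustration of the layering described above, the following is a minimal sketch, not taken from the chapter, of a lock built in software on top of a single atomic operation. It assumes C++11 `std::atomic_flag`, whose `test_and_set` typically compiles to an atomic hardware instruction; the names `SpinLock` and `count_with_lock` are hypothetical and chosen only for this example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Test-and-set spin lock: the only hardware-level primitive it relies on
// is the atomic test_and_set operation on a single flag.
class SpinLock {
    std::atomic_flag held = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its old value.
        // The lock is acquired only when the old value was false.
        while (held.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // let other threads run while waiting
        }
    }
    void unlock() {
        // Clearing the flag releases the lock to the next waiter.
        held.clear(std::memory_order_release);
    }
};

// Hypothetical helper: each of num_threads threads increments a shared
// counter iters times, with the spin lock protecting every increment.
long count_with_lock(int num_threads, int iters) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;  // critical section
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

A simple lock of this kind has low latency when uncontended, but every waiting thread repeatedly executes the atomic operation on the same flag, so contention grows with the number of threads. This is one instance of the trade-off between uncontended latency and scalability noted above.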