This chapter considers how we might score the samples of spoken language and the performances elicited by the tasks used in a second language speaking test. We give an overview of work in this area, together with an account of the four main approaches to scale development documented in the literature.