ABSTRACT

When pronunciation is assessed for research purposes, listener-based numerical scales are commonly used to assign scores to speaker performances. These scales feature simple descriptors at each end, with a number of score points marked in between. Evidence of the appropriateness of particular measurement techniques is crucial for the valid interpretation of L2 pronunciation scores. This chapter highlights important considerations in the assessment of L2 pronunciation with numerical scales, including scale function, rater variation, and difficulties associated with the rating task. These considerations are then illustrated with data from a study on L2 Korean pronunciation instruction. Many-facet Rasch measurement is presented as an analytical tool that L2 pronunciation researchers can use to investigate scale function, interval measurement, and rater differences in scale use. In L2 pronunciation research, scales have typically had five, seven, or nine score points, with nine-point scales perhaps the most commonly used.