ABSTRACT

A foundational study by A. Brown, N. Iwashita, and T. McNamara, which was influential in determining the design of the TOEFL iBT Speaking scoring rubrics, examined several specific linguistic features thought to contribute to a rater's implementation of the generic descriptions. While the content of responses to the TOEFL iBT integrated speaking tasks is constrained to some degree by the material presented in the stimulus and by the prompt question, test-takers' responses vary widely in the specific ideas they convey, the order in which those ideas are conveyed, and the linguistic components used to convey them. Because reference-based features cover this diversity only partially, an alternative approach to automated content scoring compares a test-taker's response to previously collected test-taker responses that have been scored by human raters.