ABSTRACT

This chapter examines the histories of two measurement technologies, computerized adaptive testing (CAT) and automated essay scoring (AES), both of which were 'ideas' in the mid-1960s and had working models by the mid-1970s. It documents three studies, two involving prizes, that compare the performance of machine scoring with that of trained human raters on both essay and short-answer performance assessments. The Automated Student Assessment Prize (ASAP) included a public competition in which $100,000 in cash prizes was offered to data scientists capable of producing new AES algorithms. Some of the winners from ASAP Phase I are already helping major testing companies improve their services. In Phase II, which addressed short-answer scoring, competitors were required to open-source their code and provide instruction manuals so that others could continue to build on their successes. LightSIDE, an open-source scoring engine developed at Carnegie Mellon University, was included in the evaluations alongside the commercial vendors' engines.