ABSTRACT

Because Automated Essay Evaluation (AEE) systems are typically built to predict human ratings (e.g., Attali & Burstein, 2006; Dikli, 2006; Shermis & Burstein, 2003; and Chapters 2 and 4 in this book), the quality of the human ratings can affect the quality of the scores generated by the automated system. If the human raters are inconsistent or unreliable, or if they overuse one or two points on the score scale, the machine cannot be trained effectively.