ABSTRACT

This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be used for a relatively understudied task in educational assessment: predicting item characteristics from item text. The first part of the chapter gives an introductory overview of the transition from hypothesis-driven linguistic features to non-contextualized and then contextualized embeddings. This overview is intended for assessment professionals who do not have a background in NLP. The second part demonstrates how these approaches can be applied to predicting item difficulty, response time, and item biserial for a set of clinical multiple-choice questions from a high-stakes licensing exam. These items are written to a common reading level so that they differ only in the difficulty of the construct they measure (i.e., clinical knowledge). The chapter concludes by discussing practical considerations for developing such models (e.g., the role of training data), as well as the implications of model interpretability and the use of the predictions in the context of high-stakes assessment.