ABSTRACT

This chapter presents a case study in which deep neural network language models are used for automated item generation in medical certification assessments. Natural language generation (NLG) with large neural language models has become a mainstay of natural language processing. The case study uses OpenAI’s GPT-2 transformer language model, fine-tuned on PubMed’s open-access text-mining database. The fine-tuning was carried out with TensorFlow-GPU-based toolkits available on GitHub, on a workstation equipped with two GPUs. Compared with an earlier study that used character-based recurrent neural networks trained on open-access items, the fine-tuned transformer generates higher-quality text that can serve as draft input for medical education assessment material. In addition, prompted text generation can be used to produce distractors for the multiple-choice items used in certification exams. The chapter closes with a discussion of more recent language models and of approaches that can be used instead of fine-tuning. The next generation of transformer-based language models supports guided text generation through extensions such as InstructGPT, which is based on GPT-3, the successor of the model used in this chapter.
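To make the prompted-generation step concrete, the sketch below samples candidate distractor text from a GPT-2 model given an item stem as a prompt. It is a minimal sketch only: it assumes the Hugging Face transformers package rather than the TensorFlow-GPU toolkits described in the chapter, and the prompt text, sampling parameters, and the use of the base "gpt2" checkpoint (standing in for the fine-tuned model) are illustrative assumptions, not the chapter's actual pipeline.

```python
# Minimal sketch of prompted generation for distractor drafting.
# Assumptions (not from the chapter): Hugging Face `transformers` is used
# instead of the TensorFlow-GPU toolkits; "gpt2" stands in for the
# fine-tuned medical checkpoint; prompt and sampling settings are illustrative.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # or the fine-tuned model path
model = GPT2LMHeadModel.from_pretrained("gpt2")

# An item stem serves as the prompt; sampled continuations become
# candidate distractors for a multiple-choice item draft.
prompt = "A 45-year-old patient presents with chest pain. The most likely diagnosis is"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    do_sample=True,           # sampling yields varied candidate distractors
    top_k=50,
    top_p=0.95,
    max_new_tokens=30,
    num_return_sequences=3,   # several candidates per stem
    pad_token_id=tokenizer.eos_token_id,
)

for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

Each sampled continuation is only a draft; as the chapter emphasizes, generated text is input for item writers, not a finished exam item.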