ABSTRACT

You would need samples of lectures from across disciplines, as well as samples from seminars and tutorials. A corpus of a million words would be small but probably effective, because the material is fairly focused, although it would take a considerable amount of time to collect and transcribe. A much smaller sample could be used if the lectures all came from a single subject discipline.