ABSTRACT

This chapter presents a study that brings together two distinct topics in corpus linguistics: automated Natural Language Processing (NLP) and morphological complexity. NLP here refers to the use of computational methods to automatically measure the occurrence of linguistic features in a text, typically efficiently and at large scale. These features range from type–token ratios and average word frequencies to the number of dependent clauses and the degree of semantic similarity across paragraphs in a text. Morphological features have historically been seen in linguistics as a gateway to understanding implicit knowledge about a language. Recent NLP research has produced powerful and efficient automatic text analysis tools, which have been used to measure linguistic features that are too fine-grained or too numerous to count feasibly by hand.