ABSTRACT

This chapter introduces the second edition of the Routledge Handbook of Corpus Linguistics and reflects on advances in the field since 2010. It notes that the decade has seen major changes in the availability of corpora, with cloud-based storage making multi-billion-word corpora accessible. It also notes the rise of rapid-response corpora, which can be assembled to address pressing political and social issues (e.g. Brexit and COVID-19) and which allow researchers to track societal thought processes around these major events. The chapter flags changes in text types, particularly the proliferation of social media content, and the authors query whether corpus tools are ready for such multi-modality. Learner corpora are also highlighted as being in a phase of change, with an emerging “profiling turn” that is linked to a greater synergy between corpus linguistics and second language acquisition. The chapter closes with a look back to the ancient monastic origins of concordancing, noting that software can now replicate the work of 500 monks in nanoseconds. It ends with a caveat: amid the richness of ever-growing data and ever-advancing tools, there is a risk, and with it a responsibility to safeguard the tenets of principled sampling, corpus design and representativeness.