ABSTRACT

This chapter concludes the book, summarizing the main findings from previous chapters and offering some thoughts on the consequences of the recent explosion of multilayer corpus resources for corpus linguistics in particular and for linguistics at large. A connection is drawn between the rise of usage-based models in linguistics and the need for richly annotated empirical datasets, along with some remarks on important directions in making multilayer corpora reusable for the linguistic community, capitalizing on the possibility of reannotating existing resources with open licenses. Finally, we offer a discussion of the role of richly annotated data in the age of Deep Learning approaches to corpus-based computational linguistics. It is suggested that there is much to be gained from combining distributional semantic models trained on large, unannotated datasets with richly annotated, smaller-scale corpora, using state-of-the-art neural network approaches that are capable of drawing on insights gleaned from multiple corpus resources simultaneously.