ABSTRACT

This chapter discusses variables that matter when designing, collecting, and transcribing learner corpora, drawing on expertise from Learner Corpus Research and Second Language Acquisition in order to facilitate the sharing and use of large sets of learner language. The term second language is likely inaccurate for the majority of individuals contributing to learner corpora, as many are learners or users of multiple languages. Researchers should therefore also gather data on all other languages used or studied by the learner, namely mother tongue(s), home language(s), instructed additional language(s), and extensive experience living abroad. In designing a corpus, careful selection, documentation, and justification of all criteria will increase the likelihood that the resulting corpus is methodologically sound. Because genre dictates usage, learner corpora should also include texts across genres, representing the writing and speaking requirements of different fields of study, to ensure a balanced representation of learner interlanguage.