ABSTRACT

In this chapter, the authors present their experience of working with data collected at different times and according to differing methodologies to create the Newcastle Electronic Corpus of Tyneside English (NECTE). The primary data behind NECTE were collected by two teams of sociolinguists, one working in the late 1960s and early 1970s on the Tyneside Linguistic Survey (TLS) and the other in the 1990s on the Phonological Variation and Change project. The authors focus on the challenges involved in processing the TLS data. The first challenge was to locate as many of the data and accompanying metadata as possible; the majority of the data had been left in the department of Newcastle University where the TLS team had worked. The next challenge was compliance with 21st-century standards of ethics and data protection. The final challenge for the NECTE team was to “future-proof” the corpus.