ABSTRACT

Part III of this volume, “Working with and Preserving Existing Data,” explores the issues and challenges associated with adapting existing data to the needs of sociolinguists: data treatment, in other words. It is perhaps useful to consider why sociolinguists, especially variationists, might be more willing and able than other researchers to work with existing data. Variationists’ traditional methods of data collection and analysis actually predispose us to a two-step process: first collecting data that are as naturalistic as possible, often through the sociolinguistic interview, and then reading the resulting materials closely to decide which linguistic variables might best lend themselves to analysis and discussion (see, for example, Wolfram, 1993). This means that much of our data collection is blind to its eventual purpose. From there, it is one small step to using data that were not collected for sociolinguistic reasons at all. There have, obviously, been exceptions to this purpose-blind approach since the earliest days of the field: word lists, reading passages, and Labov’s department store study and its Rapid and Anonymous Surveys (Labov, 1966). Usually, though, sociolinguistic interviews are seen as the gold standard, in large part because they are intended to draw respondents’ attention away from the recording process and so give access to their vernacular. What is it about recordings generally that encourages interviewees to avoid vernacular speech? The microphone and recorder? The act of being recorded? Traits of the interviewer (linguist, academic, stranger)? The interviewee’s knowledge of the goals of the researcher? Techniques like the danger-of-death question and linguistic modules are designed to overcome such problems, but, to some extent, data from other sources avoid them by not introducing them in the first place. Once we decide that all the data world is our research stage, certain questions arise:

1. What are “data”? The world is full of linguistic material these days, thanks largely to the internet, and it is extremely easy to access (and, in some ways, easy to subject to specific types of analysis . . . an issue perhaps beyond the scope of this volume). At what point does material turn from “a bunch of words and stuff” into something we can analyze?