ABSTRACT

Transcription enables the analysis of spoken language. Its primary goal is to reproduce speech faithfully and consistently, creating an authentic representation of language in use. The details of transcription tend to receive more attention from qualitative sociolinguists, for whom what appears in a transcript both influences and constrains the generalizations that can be drawn. There is, however, no standard protocol for sociolinguistic transcription. Every corpus is built for a different purpose, to answer different questions, in a different locale, with different demographics. Sociolinguistic corpora are by nature specialized: each is designed with a particular research question in mind, so no single set of decisions can hold for all projects. Most decisions revolve around four themes: orthography, punctuation, phonetic detail, and spontaneous speech phenomena. Most researchers stress the need for standard orthographic conventions, although there is sometimes good reason to use non-standard spellings. Punctuation choices, for their part, affect syntactic parsing.