ABSTRACT

Although fully automated software-aided transcription will be a reality at some point in the not-so-distant future, current research projects still rely on human transcription and annotation of language. This chapter describes transcription and coding by humans, with some remarks on software-aided transcription. From a linguistic perspective, there are at least two main types of transcription: orthographic and prosodic. Orthographic transcription of spoken language involves an act of interpretation that needs to be acknowledged and reflected upon. Prosodic transcription adds prosodic marking (e.g. intonation) to orthographic transcripts. Depending on the scope of the research project and the resources available, one may want to adopt different strategies towards annotation. Beyond transcription, there are many further types of annotation: syntactic annotation (parsing), semantic annotation, pragmatic annotation, discourse annotation, stylistic annotation, lexical annotation, etc. Of these, parsing and semantic annotation can be carried out automatically.