ABSTRACT

Coherence is a concept that describes the flow of information from one part of a discourse to another. A number of approaches to modeling coherence have been developed, focusing on such factors as discourse modeling (e.g., Grosz, Joshi, & Weinstein, 1995; Mann & Thompson, 1988), the effects of coherence on comprehension (e.g., Kintsch, 1988, 1998; Lorch & O’Brien, 1995), and techniques for automated segmentation of discourse (e.g., Choi, Wiemer-Hastings, & Moore, 2001; Hearst, 1997). All of these approaches must decide which aspects of discourse to use in modeling coherence. Discourse coherence is composed of many aspects, ranging from lower-level cohesive elements such as coreference, causal relationships, and connectives, up to higher-level connections between the discourse and a reader’s mental representation of it. For all coherent discourse, however, a key feature is the subjective quality of the overlap and transitions in meaning as it flows across the discourse. Latent Semantic Analysis (LSA) can model this quality of coherence and quantify it by measuring the semantic similarity of one section of text to the next.
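
As a rough illustration of measuring the similarity of one section of text to the next, the sketch below computes the cosine between each pair of adjacent section vectors and averages the results. It assumes each section has already been projected into an LSA semantic space; the vectors, the function names (cosine, adjacent_coherence, mean_coherence), and the choice of a simple mean as the summary score are illustrative assumptions, not details taken from the source.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def adjacent_coherence(section_vectors):
    """Similarity of each section to the one that follows it.

    section_vectors is assumed to hold one LSA-space vector per section,
    in document order (hypothetical input; building the space is not shown).
    """
    return [cosine(a, b) for a, b in zip(section_vectors, section_vectors[1:])]


def mean_coherence(section_vectors):
    """Average adjacent-section similarity, used here as one possible overall score."""
    sims = adjacent_coherence(section_vectors)
    return sum(sims) / len(sims) if sims else 0.0


# Toy vectors: the first two sections point the same way, the third is orthogonal.
toy_sections = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(adjacent_coherence(toy_sections))
print(mean_coherence(toy_sections))
```

A low value between two adjacent sections would suggest a break in the flow of meaning at that point, while consistently high values would suggest smooth transitions.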