ABSTRACT

This chapter centers on issues in the representation of entities as discourse unfolds, with a view to exploring and predicting signals for aforementionedness and other coreference phenomena. After discussing pertinent issues in entity and coreference annotation, the chapter explores three research questions. First, we ask how stable coreferentiality patterns are across genres, with results revealing some recurring patterns but also considerable genre-based variation. Second, a case study in the visualization of coreferentiality patterns argues for a ‘distant reading’ approach that can inform the identification of recurring patterns of repeated entity mentions. Finally, a multifactorial model is developed based on rich annotations of running text to answer the question of whether we can predict, using only properties of a current entity mention, whether that entity is expected to have been mentioned earlier in the discourse.