ABSTRACT

Markup grammars such as Standard Generalized Markup Language (SGML) and eXtensible Markup Language (XML) are important because they can be used to define customised tagsets, which allow people to embed additional knowledge in a text, including interpretive material. The purpose of text tagging is to facilitate retrieval and representation by applying what is essentially a controlled vocabulary of tags. A collection with an interpretive level of tagging is one whose tags include information that is not otherwise available in the text. Examples include regularised forms of people’s names, and specific dates and locations that the text mentions only in general terms (for instance, “yesterday” in a letter where the exact date can be ascertained). The presence of tagging in a collection gives designers an opportunity to make the tagged material visible to the users of the collection, in ways that provide greater prospect and its related advantages.