Text Analysis and Natural Language Database Systems

doi:10.4324/9781003064060-18

ABSTRACT

Text analyses must ultimately be based on quantified, or quantifiable, representations of the information in verbatim texts. An “instance” is modeled as a text block ID, an indicator of the instance’s thematic category, and an indicator of its location in the text. The syntactic categories appropriate to a particular semantic text analysis depend on the semantic grammar used by the researcher. The node representing the phrase-triggered theme, Practical Emphasis, actually represents four occurrences of the theme in the text block. The network content model is appealing because it represents text as an interrelated whole into which domain knowledge can be incorporated; however, its lack of restrictions on network structure, relative to the thematic and semantic content models, makes it more difficult to identify patterns in the network text representation systematically. In a relational database system, the data model is always a table of tuples.
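
As a rough illustration of the instance model and the "table of tuples" idea described above, the following Python sketch stores theme instances as tuples and counts occurrences per text block. The field names (`text_block_id`, `theme`, `position`), the use of a word offset as the location indicator, and all row values are assumptions made for illustration; they are not taken from the chapter.

```python
from collections import Counter
from typing import NamedTuple


class Instance(NamedTuple):
    """One occurrence of a theme in a text block (field names are hypothetical)."""
    text_block_id: int  # which text block the occurrence belongs to
    theme: str          # indicator of the instance's thematic category
    position: int       # indicator of location in the text (assumed: word offset)


# A relational-style table: a list of tuples. Rows are illustrative only.
instances = [
    Instance(1, "Practical Emphasis", 12),
    Instance(1, "Practical Emphasis", 47),
    Instance(1, "Practical Emphasis", 80),
    Instance(1, "Practical Emphasis", 131),
    Instance(2, "Practical Emphasis", 5),
]


def theme_counts(rows):
    """Count occurrences for each (text block, theme) pair."""
    return Counter((r.text_block_id, r.theme) for r in rows)


counts = theme_counts(instances)
print(counts[(1, "Practical Emphasis")])
```

Keeping one row per occurrence, rather than one row per theme, means that a single summary figure such as "four occurrences of a theme in a text block" can be derived by aggregation instead of being stored directly.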