Efficient disambiguation by means of stochastic tree-substitution grammars

doi:10.4324/9781315072685-17

ABSTRACT

Natural language grammars often assign many syntactic structures to the same sentence, most of which a human language user perceives as implausible. Many current natural language models therefore employ statistical information about language use in order to approximate this human ability to disambiguate. Some models assume a context-free grammar (CFG) in which each rule receives an unconditional application probability, e.g. Fujisaki et al. (1989), Jelinek et al. (1990). Such “context-free”² probabilities are not sufficient for natural language syntactic disambiguation, which is “context-sensitive” by nature. In contrast, other models (Schabes 1992) employ “mildly” context-sensitive grammars at the cost of computational efficiency. In between are models that parse only the context-free languages but still assume context-sensitive application probabilities, e.g. Magerman & Weir (1992), Bod (1992), Carroll & Briscoe (1992).
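
To make the notion of an unconditional application probability concrete, the following minimal sketch scores two competing analyses of a prepositional-phrase attachment ambiguity under a stochastic CFG. The grammar, its probabilities, and all names (`RULE_PROB`, `derivation_probability`, `VP_ATTACH`, `NP_ATTACH`) are hypothetical and chosen only for illustration; they are not taken from any of the cited models. Lexical rules are omitted and `PP` is treated as an unexpanded leaf to keep the example short.

```python
from math import prod

# Hypothetical toy grammar (an assumption for illustration only).
# Each rule has one fixed probability; rules sharing a left-hand side sum to 1.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("NP", "PP")): 0.4,
    ("NP", ("noun",)): 0.6,
}


def derivation_probability(rules):
    """Multiply the unconditional probabilities of the rules in a derivation.

    The probability of a rule does not depend on where in the tree it is
    applied, which is what makes the probabilities "context-free".
    """
    return prod(RULE_PROB[rule] for rule in rules)


# Two derivations of the same string "noun V noun PP".
# The PP attaches to the verb phrase:
VP_ATTACH = [
    ("S", ("NP", "VP")),
    ("NP", ("noun",)),
    ("VP", ("VP", "PP")),
    ("VP", ("V", "NP")),
    ("NP", ("noun",)),
]
# The PP attaches to the object noun phrase:
NP_ATTACH = [
    ("S", ("NP", "VP")),
    ("NP", ("noun",)),
    ("VP", ("V", "NP")),
    ("NP", ("NP", "PP")),
    ("NP", ("noun",)),
]

if __name__ == "__main__":
    print("VP attachment:", derivation_probability(VP_ATTACH))
    print("NP attachment:", derivation_probability(NP_ATTACH))
```

In this sketch the preferred attachment is decided by the two rule probabilities alone, whatever the verb, nouns, or preposition involved, which illustrates why such probabilities cannot capture context-sensitive preferences.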