ABSTRACT

Identifying the degree of semantic similarity among words is a problem with important applications in both artificial intelligence and experimental psychology. Unfortunately, rank-ordering words by semantic similarity is not straightforward, even with powerful semantic network models of the lexicon such as WORDNET (Fellbaum, 1998). The difficulty arises because “word-meaning equivalence” is not an objective quantity but a subjective, context-dependent property. Two words that seem very different in one context might be judged semantically similar in another context (and vice versa). For example, a model of the lexicon such as WORDNET would predict that the superordinates of “CUP” (“CROCKERY” or “DISHWARE”) are more consistent with the meaning of “CUP” than its subordinates (“TEA CUP” or “COFFEE CUP”). But such predictions might not be consistent with human performance: people might simply be more likely to use the phrase “COFFEE CUP” than the word “CROCKERY” when they want to express the meaning “CUP”. The goal of this research is to compare the standard WORDNET measure of semantic distance, which is based on the number of links separating two words in WORDNET, with two new algorithms for computing semantic distance.
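
As an illustration of the link-counting idea only, the sketch below computes the number of links separating two words with a breadth-first search. The small graph, the names TOY_LINKS, build_graph, and link_distance, and the choice to treat links as undirected are all assumptions made for this example; the graph is not actual WORDNET data, and the code is not the implementation evaluated in this research.

```python
from collections import deque

# Hypothetical toy hierarchy (NOT actual WORDNET data): each pair is one link.
TOY_LINKS = [
    ("tableware", "crockery"),
    ("crockery", "cup"),
    ("cup", "tea cup"),
    ("cup", "coffee cup"),
]

def build_graph(links):
    """Build an undirected adjacency map from (word, word) link pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def link_distance(graph, source, target):
    """Return the fewest links between source and target, or None if unconnected."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        word, dist = queue.popleft()
        for neighbor in graph[word]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

graph = build_graph(TOY_LINKS)
print(link_distance(graph, "cup", "crockery"))
print(link_distance(graph, "tea cup", "coffee cup"))
```

A distance computed this way depends only on the structure of the graph, not on how often people actually use each word, which is the kind of gap between model predictions and human performance discussed above.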