ABSTRACT

Resnik [10] was the first to use the following information content (ic) formula for measuring semantic similarity:

$$ic_{res}(c) = -\log p(c) \tag{1}$$

The ic value is thus obtained as the negative log-likelihood, where c is a concept in WordNet and p(c) is the probability of encountering c in a given corpus. Formally, the semantic similarity between two concepts is evaluated as follows:

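$$sim_{res}(c_1, c_2) = \max_{c \in S(c_1, c_2)} ic_{res}(c)$$

where S(c_1, c_2) is the set of concepts that subsume both c_1 and c_2.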
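As a rough illustration of the two formulas above, the following Python sketch computes the information content and the Resnik similarity on a tiny hand-made taxonomy. The taxonomy, the corpus counts, and the function names (`subsumers`, `concept_probabilities`, `ic_res`, `sim_res`) are assumptions made for this example only; they are not taken from WordNet or from any corpus used in this paper. The sketch also assumes the usual convention that an occurrence of a concept counts as an occurrence of each of its ancestors.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not taken from WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}

# Hypothetical corpus counts of each concept's own occurrences.
COUNTS = {"entity": 0, "animal": 2, "plant": 1, "dog": 5, "cat": 4, "tree": 4}


def subsumers(concept):
    """Return the concept together with all of its ancestors."""
    result = []
    while concept is not None:
        result.append(concept)
        concept = PARENT[concept]
    return result


def concept_probabilities():
    """Estimate p(c); each occurrence also counts for every ancestor."""
    freq = {c: 0 for c in PARENT}
    for concept, n in COUNTS.items():
        for ancestor in subsumers(concept):
            freq[ancestor] += n
    total = freq["entity"]  # the root subsumes every occurrence
    return {c: f / total for c, f in freq.items()}


P = concept_probabilities()


def ic_res(concept):
    """Information content: negative log of the concept probability."""
    return -math.log(P[concept])


def sim_res(c1, c2):
    """Resnik similarity: the largest ic among the common subsumers."""
    common = set(subsumers(c1)) & set(subsumers(c2))
    return max(ic_res(c) for c in common)
```

In this toy setting, `sim_res("dog", "cat")` equals the ic of "animal", their most informative common subsumer, while `sim_res("dog", "tree")` is zero because the only shared subsumer is the root, whose probability is 1.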

1 INTRODUCTION

Measuring the semantic similarity between words contributes to a better understanding of textual resources. As a result, it has been applied in many different tasks, such as FAQ systems [1], document classification, and automatic language translation. According to Liu [2], approaches to semantic similarity can be roughly divided into two categories: corpus-based measurements and ontology-based measurements. The latter mainly use taxonomies as the ontology from which similarity is calculated. For example, [3] uses WordNet to calculate the semantic similarity of English words, and Li Feng [4] uses HowNet, a Chinese semantic dictionary, to measure the semantic similarity between words and concepts.