A graph-based semantic relatedness assessment method combining Wikipedia features

2017 
Abstract Semantic relatedness assessment between concepts is a critical issue in many domains, such as artificial intelligence, information retrieval, psychology, biology, linguistics and cognitive science. Several methods therefore assess relatedness by exploiting knowledge bases to express the semantics of concepts. However, these methods have limitations, such as high-dimensional spaces, high computational complexity, and suitability only for non-dynamic domains. Wikipedia, a domain-independent encyclopedic repository with very large coverage, has been exploited by many methods as a huge semantic resource. In this paper, we propose a novel graph-based relatedness assessment method that uses Wikipedia features to avoid some of these limitations. First, for each term in a word pair, the top k most relevant Wikipedia concepts are returned by the Naive-ESA algorithm, which reduces the dimensionality of the Explicit Semantic Analysis (ESA) method. Second, for each distinct candidate concept in the two relevant concept sets, we collect its category set from the Wikipedia Category Graph (WCG). Based on the categories in the WCG network, the relatedness between the concepts at corresponding positions of the two sorted concept sets is computed as the association coefficient. Third, based on this parameter, a novel relatedness assessment metric is presented. The evaluation is performed on several well-recognized benchmark datasets, using several widely used metrics and a new metric that we designed. The results demonstrate that our method correlates better with human judgments than other related works.
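
The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the Jaccard overlap used here as the association coefficient, the plain average used as the final score, and the toy dictionaries standing in for Naive-ESA output and the WCG are all assumptions made for illustration. The names `CONCEPT_WEIGHTS`, `CATEGORIES`, `top_k_concepts`, `category_overlap` and `relatedness` are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for Naive-ESA output: term -> {concept: relevance weight}.
CONCEPT_WEIGHTS: Dict[str, Dict[str, float]] = {
    "car": {"Automobile": 0.9, "Engine": 0.6, "Road": 0.4},
    "truck": {"Truck": 0.8, "Engine": 0.7, "Road": 0.3},
}

# Hypothetical stand-in for the Wikipedia Category Graph: concept -> category set.
CATEGORIES: Dict[str, Set[str]] = {
    "Automobile": {"Vehicles", "Transport"},
    "Truck": {"Vehicles", "Transport", "Freight"},
    "Engine": {"Machines"},
    "Road": {"Infrastructure", "Transport"},
}


def top_k_concepts(term: str, k: int) -> List[str]:
    """Step 1: return the k highest-weighted concepts for a term, sorted by weight."""
    weights = CONCEPT_WEIGHTS.get(term, {})
    return sorted(weights, key=lambda c: weights[c], reverse=True)[:k]


def category_overlap(a: str, b: str) -> float:
    """Step 2 (assumed form): Jaccard overlap of the two concepts' category sets.

    Identical concepts score 1.0 without consulting the category data.
    """
    if a == b:
        return 1.0
    cats_a = CATEGORIES.get(a, set())
    cats_b = CATEGORIES.get(b, set())
    union = cats_a | cats_b
    return len(cats_a & cats_b) / len(union) if union else 0.0


def relatedness(term1: str, term2: str, k: int = 3) -> float:
    """Step 3 (assumed form): average the coefficients of position-aligned concepts."""
    pairs = list(zip(top_k_concepts(term1, k), top_k_concepts(term2, k)))
    if not pairs:
        return 0.0
    return sum(category_overlap(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    print(round(relatedness("car", "truck"), 4))
```

With the toy data, the aligned pairs are (Automobile, Truck), (Engine, Engine) and (Road, Road), so the score is the mean of 2/3, 1 and 1. A real system would replace the two dictionaries with an ESA index and category lookups against Wikipedia.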