Semantic-based Keyword Extraction Method for Document

2015 
Keyword extraction is one of the central topics in information retrieval research. Extracting document keywords on the basis of semantics is an effective way to improve the accuracy of automatic extraction. This paper proposes a semantic-based keyword extraction method that takes Chinese documents as its processing object. First, semantic distances between words are calculated using a synonym dictionary. Then, topic-related classes are obtained by density-based clustering of the words. Finally, headwords are selected from the topic-related classes and taken as keywords. A manual comparison experiment, a corpus classification experiment, and a scoring experiment were conducted. The results show that the proposed method achieves higher accuracy and recall, and that the extracted keywords are more closely related to the document topic.
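
The three steps described in the abstract (dictionary-based distance, density-based clustering, headword selection) can be illustrated with a minimal Python sketch. The abstract does not give the distance formula, the clustering parameters, or the headword criterion, so everything below is an assumption made for illustration: the toy dictionary `SYNONYM_CODES` and its category codes are hypothetical, the distance is a simple shared-prefix measure, the clustering is a plain DBSCAN-style procedure, and the headword is taken to be the cluster member closest on average to the others (ties broken by frequency).

```python
from collections import Counter

# Hypothetical toy synonym dictionary: word -> hierarchical category code.
# Assumption: a longer shared code prefix means the words are semantically closer.
SYNONYM_CODES = {
    "network": "Ba01", "internet": "Ba01", "web": "Ba02", "server": "Ba03",
    "apple": "Cf10", "banana": "Cf11",
    "tuesday": "Kd40",
}

def semantic_distance(w1, w2):
    """Distance in [0, 1]: one minus the fraction of the code prefix shared."""
    c1, c2 = SYNONYM_CODES[w1], SYNONYM_CODES[w2]
    shared = 0
    for a, b in zip(c1, c2):
        if a != b:
            break
        shared += 1
    return 1.0 - shared / max(len(c1), len(c2))

def density_cluster(words, eps=0.3, min_pts=2):
    """Minimal DBSCAN-style clustering over distinct words (eps, min_pts assumed)."""
    def neighbors(w):
        # A word counts as its own neighbor, as in standard DBSCAN.
        return [v for v in words if semantic_distance(w, v) <= eps]

    labels = {}  # word -> cluster id, or -1 for noise
    cluster_id = 0
    for w in words:
        if w in labels:
            continue
        seeds = neighbors(w)
        if len(seeds) < min_pts:
            labels[w] = -1  # provisionally noise; may become a border point later
            continue
        labels[w] = cluster_id
        queue = [v for v in seeds if v != w]
        while queue:
            v = queue.pop()
            if labels.get(v) == -1:
                labels[v] = cluster_id  # noise reached from a core point: border point
            if v in labels:
                continue
            labels[v] = cluster_id
            nb = neighbors(v)
            if len(nb) >= min_pts:  # v is a core point, so keep expanding
                queue.extend(nb)
        cluster_id += 1

    clusters = {}
    for w, c in labels.items():
        if c != -1:
            clusters.setdefault(c, []).append(w)
    return list(clusters.values())

def extract_keywords(tokens, top_k=2):
    """Pick one headword from each of the top_k most frequent clusters."""
    freq = Counter(t for t in tokens if t in SYNONYM_CODES)
    clusters = density_cluster(sorted(freq))
    # Assumption: clusters covering more token occurrences are more topic-related.
    clusters.sort(key=lambda c: -sum(freq[w] for w in c))
    keywords = []
    for c in clusters[:top_k]:
        # Headword: smallest total distance to the cluster, then highest frequency.
        head = min(c, key=lambda w: (sum(semantic_distance(w, v) for v in c), -freq[w], w))
        keywords.append(head)
    return keywords

tokens = ["network", "internet", "web", "server", "network",
          "apple", "banana", "tuesday", "the"]
print(extract_keywords(tokens))
```

In this toy run, the isolated word is treated as noise and contributes no keyword, while each dense group of related words yields one headword. A real system for Chinese text would also need word segmentation and an actual synonym dictionary, neither of which is shown here.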