Hyponym/Hypernym Detection in Science and Technology Thesauri from Bibliographic Datasets

2017 
Thesauri for science and technology information are increasingly used in bibliometrics and scientometrics. However, constructing and maintaining thesauri manually is costly and time-consuming, so methods for semi-automatic construction and maintenance are being actively studied. We propose a method that expands an existing thesaurus with specified terms extracted from the abstracts of articles. Specifically, we assign the terms to specified subcategories by clustering a word vector space, then determine the hyponyms and hypernyms based on their relations with terms in the subcategories. The word vectors are constructed from 177,000 IEEE articles archived from 2012 to 2014 in the Scopus dataset. In experiments, the terms were correctly classified into the Japan Science and Technology thesaurus with 70.8% precision and 75.4% recall. In the future, we will develop a semi-automatic thesaurus maintenance system that recommends new terms in their proper relative positions.
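The subcategory assignment step can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the similarity measure, so this example substitutes a simple nearest-centroid rule with cosine similarity. The subcategory names, the 3-dimensional vectors, and the helper functions `cosine`, `centroid`, and `assign_subcategory` are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Toy word vectors for terms already in the thesaurus, grouped by
# subcategory. Names and values are invented for illustration only.
subcategories = {
    "machine learning": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "wireless communication": [[0.1, 0.9, 0.1], [0.0, 0.8, 0.3]],
}


def assign_subcategory(term_vector, subcategories):
    """Return the subcategory whose centroid is most similar to the term.

    Assumption: nearest-centroid assignment stands in for the clustering
    step, which the abstract does not specify in detail.
    """
    centroids = {name: centroid(vecs) for name, vecs in subcategories.items()}
    return max(centroids, key=lambda name: cosine(term_vector, centroids[name]))


# Vector of a candidate term extracted from abstracts (hypothetical values).
new_term_vector = [0.85, 0.15, 0.05]
assigned = assign_subcategory(new_term_vector, subcategories)
print(assigned)
```

In a real pipeline the vectors would come from a word embedding model trained on the article abstracts, and the hyponym/hypernym decision would follow as a separate step once the subcategory is known.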