Helping term sense disambiguation with active learning

2015 
Our research highlights the problem of term polysemy within terminometrics studies. Terminometrics is the measure of term usage in specialized communication. Polysemy, especially within single-word terms as we will show, prevents the use of term corpus frequencies as appropriate statistics for terminometrics. Automatic term sense disambiguation, as a possible solution, requires human annotation to feed a supervised learning algorithm. In our experiments, we show that, although polysemous, terms have a strong in-domain sense bias, which makes random sampling of annotation data less than optimal. We suggest the use of active learning and implement it within an annotation platform as a way of reducing annotation time.
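
The abstract does not say which active learning strategy is used, so the following is only a minimal sketch of one common choice, entropy-based uncertainty sampling, and not the authors' implementation. It illustrates why a strong in-domain sense bias makes random sampling wasteful: most occurrences are confidently assigned to the dominant sense, so the annotator's time is better spent on the few occurrences the classifier is unsure about. The function names, the occurrence ids, and the probability values are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a predicted sense distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_for_annotation(predictions, batch_size):
    """Return the ids of the occurrences the classifier is least sure about.

    `predictions` is a list of (occurrence_id, sense_probabilities) pairs.
    Occurrences are ranked by decreasing entropy, so the most ambiguous
    ones are proposed to the annotator first.
    """
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [occurrence_id for occurrence_id, _ in ranked[:batch_size]]


# Hypothetical classifier output for four occurrences of one term:
# (occurrence id, [P(in-domain sense), P(other sense)])
predictions = [
    ("occ1", [0.98, 0.02]),  # confidently in-domain: little to learn from it
    ("occ2", [0.55, 0.45]),
    ("occ3", [0.90, 0.10]),
    ("occ4", [0.50, 0.50]),  # maximally uncertain
]

chosen = select_for_annotation(predictions, batch_size=2)
print(chosen)  # ['occ4', 'occ2']
```

In a full loop, the annotated occurrences would be added to the training set, the classifier retrained, and the selection repeated on the remaining pool.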