Self-Adaptive Word Segmentation Model in Military Domain Based on Conditional Random Field

2021 
To improve the adaptability of word segmentation models and compensate for the lack of large-scale annotated corpora in the military domain, we propose a self-adaptive word segmentation model for the military domain based on conditional random fields (CRF). The model extracts character, position, and part-of-speech features from text to increase the accuracy of the general word segmentation model. It also introduces boundary entropy and dictionary features, and achieves self-adaptability in the military domain by applying reverse maximum matching with a military dictionary to the word segmentation results. Experiments show that, when the training corpus is general and the test corpus is military, precision, recall, and F1-measure increase by 9.45%, 9.97%, and 9.72%, respectively, compared with the general-domain CRF model, indicating that the method effectively improves the self-adaptability of segmentation in the military domain.
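The reverse maximum matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `reverse_max_match`, the `max_len` parameter, and the toy dictionary are assumptions made for illustration, and the abstract does not specify how the matching output is combined with the CRF features.

```python
def reverse_max_match(text, dictionary, max_len=None):
    """Segment `text` by scanning right to left and taking, at each step,
    the longest dictionary word that ends at the current position.
    Characters not covered by any dictionary word become single-character tokens.
    """
    if max_len is None:
        # Assumption: cap the window at the longest dictionary entry.
        max_len = max((len(w) for w in dictionary), default=1)
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    words.reverse()  # tokens were collected right to left
    return words


# Toy dictionary (hypothetical; stands in for a domain dictionary).
toy_dictionary = {"研究", "研究生", "生命", "起源"}
print(reverse_max_match("研究生命的起源", toy_dictionary))
# → ['研究', '生命', '的', '起源']
```

Scanning from the right lets the matcher pick "生命" before "研究生" can claim the shared character, which a left-to-right longest match on the same toy dictionary would not do.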