Hierarchical Comprehensive Context Modeling for Chinese Text Classification

2019 
Chinese text classification is more challenging than text classification in other languages such as English, owing to the characteristics of Chinese text itself. In recent years, popular deep learning methods such as the convolutional neural network (CNN) and the long short-term memory (LSTM) network have been applied to text classification. However, problems remain when classifying Chinese text. For example, important but obscure context information in Chinese text is not easily extracted. To improve Chinese text classification, we propose a novel classification model, the hierarchical comprehensive context modeling network (HCCMN), which can extract more comprehensive context. Our approach extracts contextual information, integrates it with the original input, and then hierarchically extracts further context, spatial information, and high-weight local features from the integrated result. In addition, our method can retain long-term, obscure historical information. Since Chinese radiology texts are complicated and difficult to obtain, we collected a Chinese radiology medical text dataset (CIRTEXT) containing more than 56,000 real-world samples to evaluate this work. We conducted experiments on four datasets and showed that HCCMN achieves state-of-the-art performance on three selected evaluation metrics compared to the baselines. These promising results show that our hierarchical context modeling network extracts useful context from Chinese text more effectively and comprehensively.