Classwise Clustering for Classification of Imbalanced Text Data

2019 
This paper addresses the problem of classifying imbalanced text data. First, the imbalance across classes is reduced by clustering each class into multiple smaller subclasses. Each document is then represented in a lower-dimensional space whose size equals the number of subclasses, using a transformation based on the term-class relevance (TCR) measure. Next, each subclass is represented compactly as an interval-valued feature vector and stored in a knowledge base. A symbolic classifier is used to classify unlabeled text documents. Experiments are conducted on the Reuters-21578 and TDT2 text datasets. The results indicate that the proposed method outperforms existing methods.
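The interval-valued representation and symbolic classification steps can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not state how the intervals are formed or how a document is matched against them, so the sketch assumes each interval is the per-feature mean plus or minus one standard deviation, and that a document is assigned to the subclass whose intervals contain the most of its feature values. The function names `build_interval` and `classify` and the toy vectors are hypothetical. The documents are assumed to be already transformed into the lower-dimensional space, here three features for three subclasses.

```python
from statistics import mean, pstdev


def build_interval(docs):
    """Summarise one subclass as a list of per-feature (low, high) intervals.

    Assumption: interval = mean +/- one population standard deviation.
    """
    columns = list(zip(*docs))  # one tuple of values per feature
    return [(mean(c) - pstdev(c), mean(c) + pstdev(c)) for c in columns]


def classify(x, knowledgebase):
    """Return the parent class of the best-matching subclass.

    Assumption: the match score is the number of features of x that
    fall inside the subclass's intervals.
    """
    best_label, best_score = None, -1
    for label, intervals in knowledgebase:
        score = sum(lo <= v <= hi for v, (lo, hi) in zip(x, intervals))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data: the majority class is split into two subclasses,
# the minority class is kept as a single subclass.
majority_a = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
majority_b = [[0.1, 0.9, 0.0], [0.2, 0.8, 0.1]]
minority = [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]]

# Each knowledge base entry keeps the parent class label of its subclass.
knowledgebase = [
    ("majority", build_interval(majority_a)),
    ("majority", build_interval(majority_b)),
    ("minority", build_interval(minority)),
]

predicted = classify([0.05, 0.05, 0.85], knowledgebase)
print(predicted)
```

Because every subclass entry carries its parent class label, a match against any subclass resolves directly to the original class, which is what lets the split reduce imbalance without changing the label set.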