Classwise Clustering for Classification of Imbalanced Text Data

K. Swarnalatha, D. S. Guru, Basavaraj S. Anami, Mahamad Suhil

2019

This paper addresses the classification of imbalanced text data. First, the imbalance across classes is reduced by splitting each class into multiple smaller subclasses. Each document is then represented in a lower-dimensional space whose size equals the number of subclasses, using a transformation based on the term-class relevance (TCR) measure. Next, each subclass is represented compactly as an interval-valued feature vector and stored in a knowledge base. A symbolic classifier is then used to classify unlabeled text documents. Experiments are conducted on the Reuters-21578 and TDT2 text datasets. The results indicate that the proposed method performs better than the existing methods it is compared against.
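
The last two steps of the abstract (interval-valued subclass representation and symbolic classification) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that classwise clustering and the TCR-based transformation have already been applied, that each interval is taken as [mean - std, mean + std] per feature, and that a document is assigned to the subclass whose intervals contain the most of its feature values. The function names, the toy vectors, and the subclass labels are all hypothetical.

```python
from statistics import mean, pstdev


def build_knowledgebase(subclasses):
    """Map each subclass label to a list of per-feature (low, high) intervals.

    Assumption: each interval is [mean - std, mean + std] of that feature
    over the documents of the subclass. The paper may define it differently.
    """
    kb = {}
    for label, vectors in subclasses.items():
        intervals = []
        # zip(*vectors) iterates over features (columns) of the subclass
        for feature_values in zip(*vectors):
            mu = mean(feature_values)
            sigma = pstdev(feature_values)
            intervals.append((mu - sigma, mu + sigma))
        kb[label] = intervals
    return kb


def classify(doc, kb, parent_of):
    """Return the parent class of the best-matching subclass.

    Assumption: the match score is the number of features of `doc`
    that fall inside the corresponding interval of the subclass.
    """
    def score(intervals):
        return sum(low <= x <= high for x, (low, high) in zip(doc, intervals))

    best = max(kb, key=lambda label: score(kb[label]))
    return parent_of[best]


# Toy data: two subclasses of a large class and one subclass of a small class.
# Each vector has one dimension per subclass, as in the reduced representation.
subclasses = {
    "earn_1": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "earn_2": [[0.2, 0.9, 0.1], [0.1, 0.8, 0.0]],
    "grain_1": [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]],
}
parent_of = {"earn_1": "earn", "earn_2": "earn", "grain_1": "grain"}

kb = build_knowledgebase(subclasses)
predicted = classify([0.85, 0.15, 0.05], kb, parent_of)
print(predicted)
```

Because the final label is that of the parent class, splitting a large class into several subclasses changes only how the knowledge base is organized and leaves the set of output labels unchanged.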

Keywords:

Cluster analysis
Pattern recognition
Artificial intelligence
Computer science
Classifier (linguistics)
Hierarchical clustering
Compact space
Subclass
Feature vector
