Improvement in Boosting Method by Using RUSTBoost Technique for Class Imbalanced Data

2019 
The class imbalance problem is common in many fields and arises from imbalanced datasets. A dataset is considered imbalanced when the number of examples in one class is much larger or smaller than in another class. Data mining algorithms may produce suboptimal classification models when trained on imbalanced datasets. Several techniques have been proposed to address the class imbalance problem. One of them combines boosting with a resampling technique, an approach that has gained popularity; examples include Random Undersampling Boosting (RUSBoost) and Synthetic Minority Oversampling Boosting Technique (SMOTEBoost). RUSBoost uses random undersampling as its resampling technique. One disadvantage of random undersampling is the possible loss of important data, which redundancy-driven modified Tomek-link based undersampling overcomes. We propose a new hybrid undersampling algorithm that uses redundancy-driven modified Tomek-link based undersampling as the resampling technique, together with boosting, to learn from imbalanced training data. Experiments are performed on various datasets from different application domains. The results are compared with those of decision tree, Support Vector Machine (SVM), logistic regression, and K-Nearest Neighbor (KNN) classifiers to evaluate the performance of the proposed method.
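To make the resampling idea concrete, the following is a minimal sketch of plain Tomek-link undersampling, not the redundancy-driven modified variant proposed in the paper, whose details the abstract does not give. A Tomek link is a pair of examples from different classes that are each other's nearest neighbour; removing the majority-class member of each link is the classic undersampling step. The function names `tomek_links` and `undersample_tomek`, the Euclidean distance, and the toy data are assumptions made for illustration, and the boosting loop that would wrap this step is omitted.

```python
import numpy as np

def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links.

    Illustrative helper (hypothetical name): a pair is a Tomek link when the
    two points are mutual nearest neighbours and carry different labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances (assumed metric).
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # A point must not be counted as its own nearest neighbour.
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)
    links = []
    for i, j in enumerate(nn):
        j = int(j)
        if i < j and nn[j] == i and y[i] != y[j]:
            links.append((i, j))
    return links

def undersample_tomek(X, y, majority_label):
    """Drop the majority-class member of every Tomek link."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    drop = {i for pair in tomek_links(X, y) for i in pair
            if y[i] == majority_label}
    keep = np.array([i for i in range(len(y)) if i not in drop], dtype=int)
    return X[keep], y[keep]

# Toy one-dimensional data: class 0 is the majority, class 1 the minority.
X = np.array([[0.0], [0.3], [0.5], [1.0], [1.1], [2.0]])
y = np.array([0, 0, 0, 0, 1, 1])

print(tomek_links(X, y))  # [(3, 4)]
X_res, y_res = undersample_tomek(X, y, majority_label=0)
print(y_res.tolist())     # [0, 0, 0, 1, 1]
```

In this toy data only the majority point at 1.0, which sits next to the minority point at 1.1, is removed. Random undersampling, by contrast, could discard any majority point regardless of where it lies, which is the data-loss risk the abstract mentions.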