Locally Linear Support Vector Machines for Imbalanced Data Classification

2021 
Classification of imbalanced data is one of the most challenging aspects of machine learning. Despite over two decades of progress, there is still a need for new techniques capable of overcoming the numerous difficulties inherent in imbalanced datasets. In this paper, we propose Locally Linear Support Vector Machines (LL-SVMs) for effectively handling imbalanced datasets. LL-SVM is a lazy learning approach that trains a local classifier for each new test instance using its k nearest neighbors. In this way, we are able to maximize the margin in the original input feature space and adapt better to complex class boundaries. We combine LL-SVMs with local oversampling and cost-sensitive approaches to make them skew-insensitive. Working only in the local neighborhood significantly improves generalization over the minority class and tackles instance-level difficulties such as class overlap, borderline and noisy instances, and small disjuncts. An extensive experimental study shows that our local models are able to outperform their global counterparts, especially when handling difficult, borderline, and noisy imbalanced datasets.
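The lazy, per-instance training described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn is available, the function name `predict_local_linear_svm` and the defaults `k=15` and `C=1.0` are hypothetical, and `class_weight="balanced"` is used as a simple stand-in for the cost-sensitive variant, whose exact formulation the abstract does not give. Local oversampling is not shown.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC


def predict_local_linear_svm(X_train, y_train, X_test, k=15, C=1.0):
    """Lazy prediction: fit one linear SVM per test point on its k nearest neighbors.

    Hypothetical helper for illustration; k and C are assumed defaults.
    """
    # Index the training set once; no global classifier is ever trained.
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, neighbor_idx = nn.kneighbors(X_test)

    predictions = np.empty(len(X_test), dtype=y_train.dtype)
    for i, idx in enumerate(neighbor_idx):
        X_local, y_local = X_train[idx], y_train[idx]
        classes = np.unique(y_local)

        # A single-class neighborhood needs no SVM: return that class.
        if len(classes) == 1:
            predictions[i] = classes[0]
            continue

        # Linear kernel keeps the margin in the original input feature space.
        # class_weight="balanced" reweights errors by inverse local class
        # frequency (an assumed stand-in for the cost-sensitive approach).
        clf = SVC(kernel="linear", C=C, class_weight="balanced")
        clf.fit(X_local, y_local)
        predictions[i] = clf.predict(X_test[i:i + 1])[0]

    return predictions
```

Because a separate model is fit for every query, prediction cost grows with the number of test instances; the neighborhood size k controls the trade-off between locality and the amount of data each local SVM sees.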