An Improved Algorithm for Imbalanced Data and Small Sample Size Classification
2015
Traditional classification algorithms perform poorly on imbalanced data sets and on small samples. To address this problem, a novel method is proposed that changes the class distribution by adding virtual samples generated by the windowed regression over-sampling (WRO) method. WRO reflects not only the additive effects but also the multiplicative effects between samples. The proposed method is compared with other over-sampling methods, such as the synthetic minority over-sampling technique (SMOTE) and borderline over-sampling (BOS), on UCI data sets and a Fourier transform infrared spectroscopy (FTIR) data set. The experimental results show that WRO can achieve better performance than the other methods.
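To illustrate the general idea of rebalancing a class by adding virtual samples, the following is a minimal sketch of SMOTE-style interpolation, one of the baseline methods named above. It is not the WRO method, whose details are not given in this abstract. The function name `smote_like_oversample` and the parameters `n_new`, `k`, and `seed` are assumptions chosen for illustration.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Generate n_new virtual minority samples by linear interpolation
    between a minority sample and one of its k nearest minority neighbors.
    Illustrative SMOTE-style sketch only; this is NOT the WRO method."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise Euclidean distances; exclude each point from its own neighbors.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbors for each new point.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Place the virtual sample at a random position on the connecting segment.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])

# Hypothetical toy minority class with four 2-D samples.
X_minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]])
X_virtual = smote_like_oversample(X_minority, n_new=6)
```

Because each virtual sample lies on a segment between two existing minority samples, this baseline captures only additive (linear) relations between samples, which is the limitation the abstract says WRO is meant to go beyond.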
Keywords:
- Sample size determination
- Support vector machine
- Statistical classification
- Multiplicative function
- Oversampling
- Fourier transform infrared spectroscopy
- Data set
- Machine learning
- Artificial intelligence
- Mathematics
- Pattern recognition
- Improved algorithm
- Imbalanced data
- Regression
- Small sample
- Multiplicative effect