DETERMINATION OF AN OPTIMAL PIPELINE FOR IMBALANCED CLASSIFICATION: PREDICTING POTENTIAL CUSTOMER COMPLAINTS TO A TEXTILE MANUFACTURER

Ssu-Han Chen, Wei-Hsin Lin

2021

Customer complaints damage a manufacturer's reputation and incur losses, so there is an urgent need to reduce them. This study predicts the likelihood that a new production order will lead to a complaint, using the order's intrinsic features. Complaints, however, are relatively rare, which creates a serious class-imbalance problem when training a classifier. To overcome this problem, we use a pipeline consisting of an upsampling step, a hyper-parameter generation step, a classifier, and an evaluation metric. Because each stage can be implemented with several alternative techniques, we use the design of experiments (DOE) concept to find a suitable combination automatically. A multi-response DOE is used to maximize balanced accuracy and minimize overfitting during training. In the experiments, the balanced accuracy of the proposed method on the testing dataset was about 23.6% better than that of the base classifiers and about 7% better than that of the current state-of-the-art methods.
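The idea of searching over pipeline combinations can be illustrated with a minimal sketch. It is not the authors' implementation. The factor levels listed in `FACTORS`, the function names, and the composite score in `multi_response_score` (validation balanced accuracy minus a weighted train/validation gap) are all assumptions made for illustration, and a full-factorial enumeration stands in for whatever experimental design the paper actually uses.

```python
from itertools import product


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for binary labels this is (TPR + TNR) / 2."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


def full_factorial(factors):
    """Yield every combination of factor levels as a dict."""
    names = list(factors)
    for levels in product(*(factors[name] for name in names)):
        yield dict(zip(names, levels))


def multi_response_score(train_bacc, valid_bacc, gap_weight=1.0):
    """Hypothetical composite response: reward validation balanced accuracy
    and penalize the train/validation gap as a proxy for overfitting."""
    return valid_bacc - gap_weight * abs(train_bacc - valid_bacc)


# Hypothetical factor levels for the four pipeline stages named in the abstract.
FACTORS = {
    "upsampler": ["none", "random", "smote"],
    "hyperparameter_search": ["grid", "random"],
    "classifier": ["logistic_regression", "random_forest"],
    "metric": ["accuracy", "balanced_accuracy"],
}

design = list(full_factorial(FACTORS))  # one dict per candidate pipeline

# Toy imbalanced labels: 8 orders without a complaint, 2 with a complaint.
y_true = [0] * 8 + [1] * 2
y_pred_majority = [0] * 10  # a classifier that always predicts "no complaint"

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred_majority)) / len(y_true)
majority_bacc = balanced_accuracy(y_true, y_pred_majority)
```

On the toy labels, a classifier that always predicts the majority class looks acceptable under plain accuracy but scores no better than chance under balanced accuracy, which is why the latter is the more informative response for rare complaints. In practice each entry of `design` would be trained and scored, and the run with the best composite response kept.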
