Fast Maximum Entropy Machine for Big Imbalanced Datasets

2018 
Driven by the needs of many machine learning applications, several attempts have been made to improve the performance of classifiers on imbalanced datasets. In this paper, we present a fast maximum entropy machine (MEM) combined with a synthetic minority over-sampling technique for binary classification problems with high imbalance ratios, large numbers of data samples, and medium to large numbers of features. A random Fourier feature representation of kernel functions and the primal estimated sub-gradient solver for support vector machines (PEGASOS) are applied to speed up the classic MEM. Experiments were conducted on various real datasets (two China Mobile datasets and several standard test datasets) under various configurations. The results demonstrate that the proposed algorithm has very low computational complexity while achieving excellent overall classification performance, in terms of several widely used evaluation metrics, compared with the classic MEM and other state-of-the-art methods. Its low computational complexity makes the proposed algorithm particularly valuable in big data applications.
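
As an illustration of the random Fourier feature idea mentioned above, the following is a minimal sketch and not the paper's implementation. It assumes a Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2); the function names, the `gamma` parameter, and the number of components are hypothetical choices made for the example. The point it shows is that an explicit, finite-dimensional feature map can approximate the kernel, so a linear solver such as PEGASOS can be run on the mapped data instead of working with a full kernel matrix.

```python
import numpy as np


def random_fourier_features(X, n_components=500, gamma=1.0, seed=0):
    """Map X (n_samples, n_features) to features whose inner products
    approximate the RBF kernel exp(-gamma * ||x - y||^2).

    Illustrative helper; the name and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Frequencies are drawn from the Fourier transform of the RBF kernel,
    # which is a Gaussian with variance 2 * gamma per dimension.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(n_features, n_components))
    # Random phase offsets, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_components)
    return np.sqrt(2.0 / n_components) * np.cos(X @ W + b)


def rbf_kernel(X, gamma=1.0):
    """Exact RBF kernel matrix, used here only to check the approximation."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(20, 5))
    Z = random_fourier_features(X, n_components=20000, gamma=0.1)
    max_err = np.max(np.abs(Z @ Z.T - rbf_kernel(X, gamma=0.1)))
    print(f"max absolute kernel approximation error: {max_err:.4f}")
```

The mapped matrix has one row per sample and `n_components` columns, so the cost of training a linear model on it grows linearly with the number of samples, which is the motivation for this kind of approximation on large datasets.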