A novel Semi-Supervised Ensemble algorithm using a Performance-Based Selection metric to non-stationary data streams

2021 
Abstract In this article, we consider semi-supervised data stream classification. Most semi-supervised learning algorithms lack a proper metric for selecting among the newly labeled data points during training. These approaches mainly use the base learners' probability estimates for their own predictions as the selection metric, which is not optimal in many cases. Handling different kinds of concept drift is another issue in data streams. To address these issues, we propose a novel Semi-Supervised Ensemble algorithm using a Performance-Based Selection metric for data streams, named SSE-PBS. The proposed selection metric is based on a pseudo-accuracy and an energy regularization factor. We show that SSE-PBS improves classification performance and handles different kinds of concept drift. The proposed algorithm can also employ any kind of incremental base learner. In the experiments, we report results for two base learners on synthetic and real-world datasets. The experiments show that SSE-PBS significantly improves the classification performance of the underlying base learners. Furthermore, we compare the results with state-of-the-art supervised and semi-supervised approaches for data streams. The results further show that SSE-PBS outperforms the other methods when only a small portion of the instances is labeled.
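For context, the conventional selection rule that the abstract criticizes (accepting a pseudo-labeled point when the base learner's own probability estimate is high) can be sketched as follows. This is a minimal illustration of that baseline only, not of SSE-PBS: the abstract does not define the pseudo-accuracy or energy regularization terms, so they are not reproduced here. The names `CentroidLearner`, `self_train`, and the `threshold` parameter are hypothetical and chosen for this example.

```python
import math
from collections import defaultdict


class CentroidLearner:
    """Toy incremental base learner (hypothetical): one running mean per class."""

    def __init__(self):
        self.sums = {}                  # class label -> per-feature running sum
        self.counts = defaultdict(int)  # class label -> number of points seen

    def learn_one(self, x, y):
        # Update the running sum and count for class y with one instance.
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict_proba(self, x):
        # Softmax over negative squared distances to each class centroid.
        scores = {}
        for y, s in self.sums.items():
            centroid = [v / self.counts[y] for v in s]
            scores[y] = -sum((a - b) ** 2 for a, b in zip(x, centroid))
        top = max(scores.values())
        exps = {y: math.exp(v - top) for y, v in scores.items()}
        total = sum(exps.values())
        return {y: e / total for y, e in exps.items()}


def self_train(stream, learner, threshold=0.9):
    """Confidence-based self-training over a stream of (x, y) pairs.

    y is None for unlabeled instances. Returns how many pseudo-labeled
    instances were accepted into training.
    """
    accepted = 0
    for x, y in stream:
        if y is not None:
            learner.learn_one(x, y)      # labeled instance: train directly
            continue
        if not learner.sums:
            continue                     # nothing learned yet, cannot predict
        proba = learner.predict_proba(x)
        label, confidence = max(proba.items(), key=lambda kv: kv[1])
        if confidence >= threshold:      # selection metric: the learner's own estimate
            learner.learn_one(x, label)
            accepted += 1
    return accepted
```

Because the acceptance test relies solely on the learner's own confidence, a poorly calibrated learner can reinforce its own mistakes. This is the weakness that a performance-based selection metric is meant to address.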