CODES: Efficient Incremental Semi-Supervised Classification Over Drifting and Evolving Social Streams

2020 
Classification over data streams is a crucial task in social stream mining and computing. Efficient learning techniques support high-quality services for content distribution and event browsing. Because of concept drift and concept evolution in data streams, classification performance degrades drastically over time. Many existing methods rely on supervised or unsupervised learning strategies. However, supervised strategies require labels for emerging records to update the classifiers, which is infeasible in practical social stream applications. Unsupervised strategies are widely used to detect concept evolution, but running online clustering incurs a very high run-time cost. To address these challenges, we propose an efficient incremental semi-supervised classification method named CODES (Classification Over Drifting and Evolving Stream). CODES consists of an efficient incremental semi-supervised learning module and a dynamic novelty threshold update module. In drifting and evolving social streams, CODES therefore provides: 1) semi-supervised learning, which eliminates the dependency on labels of emerging records; 2) fast incremental learning with real-time updates to handle concept drift; and 3) efficient novel class detection to handle concept evolution. Extensive experiments on several real-world datasets indicate that CODES achieves higher performance than several state-of-the-art methods. CODES learns efficiently over drifting and evolving social streams, which makes it practical for real-world social stream applications.
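The three abilities listed in the abstract can be illustrated with a small toy sketch. This is not the CODES algorithm, whose details the abstract does not give. It is a generic nearest-centroid stream classifier under stated assumptions: centroids are updated incrementally from the model's own predictions (a simple form of self-training), and a record is flagged as novel when its distance to the nearest centroid exceeds a threshold that is recomputed from running statistics. The class name `StreamSketch`, the parameter `k`, and the mean-plus-`k`-standard-deviations rule are all hypothetical choices made for illustration.

```python
import math


class StreamSketch:
    """Toy nearest-centroid stream classifier (illustration only, NOT CODES)."""

    def __init__(self, k=3.0):
        self.centroids = {}  # label -> (mean vector, record count)
        self.k = k           # hypothetical width of the novelty band
        self.n = 0           # number of distances observed so far
        self.mean = 0.0      # running mean of observed distances
        self.m2 = 0.0        # running sum of squared deviations (Welford)

    def _update_centroid(self, label, x):
        # Incremental mean update: no past records need to be stored.
        mean, count = self.centroids.get(label, ([0.0] * len(x), 0))
        count += 1
        mean = [m + (xi - m) / count for m, xi in zip(mean, x)]
        self.centroids[label] = (mean, count)

    def _update_threshold(self, d):
        # Welford's online update of the distance mean and variance.
        self.n += 1
        delta = d - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (d - self.mean)

    def threshold(self):
        # Assumed rule: mean + k * std of distances seen so far.
        if self.n < 2:
            return math.inf
        return self.mean + self.k * math.sqrt(self.m2 / (self.n - 1))

    def learn(self, x, label):
        """Supervised update from an initially labeled record."""
        if label in self.centroids:
            self._update_threshold(math.dist(x, self.centroids[label][0]))
        self._update_centroid(label, x)

    def predict(self, x):
        """Classify an unlabeled record; return None if it looks novel."""
        label, d = min(
            ((lbl, math.dist(x, c[0])) for lbl, c in self.centroids.items()),
            key=lambda pair: pair[1],
        )
        if d > self.threshold():
            return None  # candidate record of a new class
        # Self-training step: adapt using the predicted (pseudo) label.
        self._update_centroid(label, x)
        self._update_threshold(d)
        return label


clf = StreamSketch()
for point, label in [((0, 0), "a"), ((1, 0), "a"), ((0, 1), "a"),
                     ((10, 10), "b"), ((11, 10), "b"), ((10, 11), "b")]:
    clf.learn(point, label)

near_a = clf.predict((0.3, 0.3))   # close to the "a" centroid
far_away = clf.predict((50, 50))   # far from every known centroid
```

In this sketch, only the first few records carry labels. Later records update the model through their predicted labels, and the novelty threshold moves as the distance statistics change.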