Online Streaming Feature Selection Based on Feature Interaction

2020 
In many big data applications, online streaming feature selection plays a critical role in processing feature streams and dealing with high-dimensional problems. However, traditional online streaming feature selection methods focus on relevant, irrelevant, and/or redundant features and ignore the interaction between features: an individual feature may be irrelevant to or only weakly correlated with the label, yet become strongly correlated with the label when combined with another irrelevant or weakly correlated feature. In this paper, we propose a novel feature selection algorithm, based on neighborhood rough sets, that takes feature interaction into account. The algorithm selects features according to two principles: the discrimination capability of the selected feature subset should be greater than or equal to that of the original feature space, and the selected subset should be as small as possible, which is achieved by exploiting feature interaction. Under this framework, we propose an online significance analysis criterion to select features that are significant relative to the currently selected features, and design an online redundancy analysis criterion to retain highly interactive features and filter out redundant ones. Experimental results on a series of benchmark datasets show that the proposed algorithm significantly outperforms other state-of-the-art online streaming feature selection methods.
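The notion of feature interaction described above can be illustrated with a small toy example. The sketch below is not the paper's algorithm. It assumes a generic neighborhood-rough-set-style dependency degree (the fraction of samples whose neighborhood contains only samples with the same label), and the function name `neighborhood_dependency`, the radius `delta`, and the choice of Chebyshev distance are all hypothetical choices made for illustration. On XOR-labelled data each feature alone has no discrimination capability, while the pair together discriminates every sample.

```python
from itertools import product


def neighborhood_dependency(samples, labels, features, delta=0.5):
    """Illustrative dependency degree (an assumption, not the paper's criterion).

    Returns the fraction of samples whose delta-neighborhood, measured with
    Chebyshev distance over the chosen features, contains only samples that
    share the sample's own label.
    """
    if not features:
        # With no features, every sample is a neighbor of every other sample.
        return 1.0 if len(set(labels)) <= 1 else 0.0
    consistent = 0
    for i, x in enumerate(samples):
        pure = True
        for j, y in enumerate(samples):
            dist = max(abs(x[f] - y[f]) for f in features)
            if dist <= delta and labels[j] != labels[i]:
                pure = False  # a differently labelled neighbor was found
                break
        consistent += pure
    return consistent / len(samples)


# Toy data: the label is the XOR of two binary features.
samples = [(a, b) for a, b in product((0, 1), repeat=2)]
labels = [a ^ b for a, b in samples]

dep_f0 = neighborhood_dependency(samples, labels, [0])
dep_f1 = neighborhood_dependency(samples, labels, [1])
dep_both = neighborhood_dependency(samples, labels, [0, 1])

print(dep_f0, dep_f1, dep_both)  # 0.0 0.0 1.0
```

A selector that scores each arriving feature only on its own would discard both features here, which is the failure mode that an interaction-aware criterion is meant to avoid.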