Rough set based semi-supervised feature selection via ensemble selector

2019 
Abstract Like feature selection over completely labeled data, feature selection over partially labeled data (semi-supervised feature selection) aims to find a feature subset that satisfies an intended constraint. Nevertheless, two difficulties may emerge in semi-supervised feature selection: (1) labels are incomplete, since labeled and unlabeled samples coexist in the data; (2) the explanation of the selected feature subset is not clear. Our research mainly addresses these two problems. Firstly, the labels of the unlabeled samples can be predicted through various semi-supervised learning methods. Secondly, the Local Neighborhood Decision Error Rate is proposed to construct multiple fitness functions for evaluating the significance of each candidate feature. This mechanism not only realizes an ensemble selector in the process of feature selection, but also yields a qualified feature subset with lower decision errors. A heuristic algorithm is then re-designed to execute feature selection. Finally, tests over nine different ratios (10%, 20%, … , 90%) of labeled samples show that our approach is superior to previous approaches, mainly because: (1) the qualified feature subset derived by our approach provides better classification performance; (2) our feature selection process requires less time.
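The abstract does not define the Local Neighborhood Decision Error Rate or the heuristic in detail, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes that labels for unlabeled samples have already been predicted, that a neighborhood error rate can be approximated by a majority vote among the other samples within a radius, and that the "multiple fitness functions" can be stood in for by the same measure at several radii. The names `neighborhood_error_rate`, `ensemble_forward_selection`, and `radii` are hypothetical.

```python
from collections import Counter
from math import dist


def neighborhood_error_rate(X, y, features, radius):
    """Fraction of samples whose label differs from the majority label of the
    other samples within `radius`, measuring distance over `features` only.
    (Assumed stand-in for a neighborhood decision error rate.)"""
    proj = [[row[f] for f in features] for row in X]
    errors = 0
    for i, p in enumerate(proj):
        votes = Counter(
            y[j] for j, q in enumerate(proj) if j != i and dist(p, q) <= radius
        )
        # In this sketch an empty neighborhood is counted as an error.
        if not votes or votes.most_common(1)[0][0] != y[i]:
            errors += 1
    return errors / len(X)


def ensemble_forward_selection(X, y, radii=(0.1, 0.2, 0.3)):
    """Greedy forward selection in which each fitness function (one per
    radius, an assumption) votes for the candidate feature it prefers."""
    n_features = len(X[0])
    selected = []
    best = [1.0] * len(radii)  # worst-case error for the empty subset
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        # Error of every candidate subset under every fitness function.
        scores = {
            f: [neighborhood_error_rate(X, y, selected + [f], r) for r in radii]
            for f in candidates
        }
        # Each fitness function votes for its lowest-error candidate.
        votes = Counter(
            min(candidates, key=lambda f: scores[f][k]) for k in range(len(radii))
        )
        winner = votes.most_common(1)[0][0]
        # Stop when the voted feature no longer lowers the total error.
        if sum(scores[winner]) >= sum(best):
            break
        selected.append(winner)
        best = scores[winner]
    return selected
```

Calling `ensemble_forward_selection(X, y)` on a list of feature rows `X` and a label list `y` returns the indices of the selected features in the order they were added. Voting across several fitness functions, rather than trusting a single one, is what makes the selector an ensemble in this sketch.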