AutoFS: Automated Feature Selection via Diversity-aware Interactive Reinforcement Learning

2020 
In this paper, we study the problem of balancing effectiveness and efficiency in automated feature selection. Feature selection aims to find the optimal feature subset in a large-scale feature space and is a fundamental capability for machine learning and predictive analysis. After exploring many feature selection methods, we observe a computational dilemma: 1) traditional feature selection methods (e.g., K-Best, decision tree based ranking, mRMR) are mostly efficient but struggle to identify the best subset; 2) the emerging reinforced feature selection methods automatically navigate the feature space to explore the best subset but are usually inefficient. Must automation and efficiency always be at odds? Can we bridge the gap between effectiveness and efficiency under automation? Motivated by this computational dilemma, this study develops a novel feature space navigation method. To that end, we propose an Interactive Reinforced Feature Selection (IRFS) framework that guides agents not only through their own exploration experience but also through diverse, skilled external trainers, in order to accelerate learning for feature exploration. Specifically, we formulate feature selection as an interactive reinforcement learning problem. In this framework, we first model two trainers skilled at different search strategies: (1) a KBest based trainer; (2) a Decision Tree based trainer. We then develop two strategies: (1) identifying assertive and hesitant agents to diversify agent training, and (2) letting the two trainers take the teaching role in different stages, which fuses the trainers' experience and diversifies the teaching process. Such a hybrid teaching strategy helps agents learn broader knowledge and thereby become more effective. Finally, we present extensive experiments on real-world datasets to demonstrate the improved performance of our method: it is more efficient than reinforced selection and more effective than classic feature selection.
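To make the idea of two trainers teaching in different stages more concrete, the following is a minimal sketch and not the authors' implementation. It assumes one agent per feature with a binary select/deselect action, a hypothetical `switch_step` at which the teaching role passes from the first trainer to the second, and that advice is given only to agents that currently select their feature. The correlation score stands in for a KBest-style ranking and the one-threshold split stands in for a decision-tree importance; all function names and the toy data are illustrative.

```python
from statistics import mean, pstdev


def abs_correlation(column, target):
    """KBest-style score (assumed): absolute Pearson correlation with the label."""
    sx, sy = pstdev(column), pstdev(target)
    if sx == 0 or sy == 0:  # a constant column carries no signal
        return 0.0
    mx, my = mean(column), mean(target)
    cov = mean((a - mx) * (b - my) for a, b in zip(column, target))
    return abs(cov / (sx * sy))


def stump_accuracy(column, target):
    """Stand-in for tree-based importance: best accuracy of a one-threshold split."""
    best = 0.0
    for t in sorted(set(column)):
        acc = mean(1.0 if (v > t) == bool(y) else 0.0
                   for v, y in zip(column, target))
        best = max(best, acc, 1.0 - acc)  # either side may predict class 1
    return best


def pick_trainer(step, switch_step):
    """Hypothetical staged schedule: first trainer early, second trainer later."""
    return abs_correlation if step < switch_step else stump_accuracy


def advise(columns, target, actions, k, score):
    """Return advised actions: among currently selected features, keep the
    trainer's top-k; agents that did not select their feature are left alone."""
    selected = [j for j, a in enumerate(actions) if a == 1]
    ranked = sorted(selected, key=lambda j: score(columns[j], target), reverse=True)
    advised = set(ranked[:k])
    return [1 if (a == 1 and j in advised) else 0 for j, a in enumerate(actions)]


# Toy data (illustrative only): four feature columns and a binary label.
columns = [
    [0, 0, 1, 1, 0, 1],  # identical to the label
    [1, 0, 1, 0, 1, 0],  # weakly related
    [0, 0, 1, 1, 1, 1],  # mostly aligned with the label
    [5, 5, 5, 5, 5, 5],  # constant
]
target = [0, 0, 1, 1, 0, 1]

early = advise(columns, target, [1, 1, 1, 1], k=2, score=pick_trainer(0, 5))
late = advise(columns, target, [0, 1, 1, 1], k=1, score=pick_trainer(7, 5))
```

In a full interactive reinforcement learning loop, the advised actions would replace or supplement the agents' own actions during the early training steps, after which the agents would continue exploring on their own; that loop is omitted here.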