TY - GEN
T1 - AutoFS
T2 - 20th IEEE International Conference on Data Mining, ICDM 2020
AU - Fan, Wei
AU - Liu, Kunpeng
AU - Liu, Hao
AU - Wang, Pengyang
AU - Ge, Yong
AU - Fu, Yanjie
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/11
Y1 - 2020/11
N2 - In this paper, we study the problem of balancing effectiveness and efficiency in automated feature selection. Feature selection aims to find the optimal feature subset in a large-scale feature space and is a fundamental capability for machine learning and predictive analysis. After exploring many feature selection methods, we observe a computational dilemma: (1) traditional feature selection methods (e.g., K-Best, decision-tree-based ranking, mRMR) are mostly efficient but have difficulty identifying the best subset; (2) the emerging reinforced feature selection methods automatically navigate the feature space to explore the best subset but are usually inefficient. Are automation and efficiency always at odds? Can we bridge the gap between effectiveness and efficiency under automation? Motivated by this dilemma, this study develops a novel feature space navigation method. To that end, we propose an Interactive Reinforced Feature Selection (IRFS) framework that guides agents not only by their self-exploration experience but also by diverse external skilled trainers, accelerating learning for feature exploration. Specifically, we formulate the feature selection problem within an interactive reinforcement learning framework. In this framework, we first model two trainers skilled at different search strategies: (1) a K-Best-based trainer and (2) a decision-tree-based trainer. We then develop two strategies: (1) identifying assertive and hesitant agents to diversify agent training, and (2) enabling the two trainers to take the teaching role at different stages, fusing the trainers' experience and diversifying the teaching process. Such a hybrid teaching strategy helps agents learn broader knowledge and thereby become more effective. Finally, we present extensive experiments on real-world datasets to demonstrate the improved performance of our method: it is more efficient than reinforced selection and more effective than classic feature selection.
UR - http://www.scopus.com/inward/record.url?scp=85100870765&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100870765&partnerID=8YFLogxK
U2 - 10.1109/ICDM50108.2020.00117
DO - 10.1109/ICDM50108.2020.00117
M3 - Conference contribution
AN - SCOPUS:85100870765
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 1008
EP - 1013
BT - Proceedings - 20th IEEE International Conference on Data Mining, ICDM 2020
A2 - Plant, Claudia
A2 - Wang, Haixun
A2 - Cuzzocrea, Alfredo
A2 - Zaniolo, Carlo
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 November 2020 through 20 November 2020
ER -