End-to-end Feature Selection Approach for Learning Skinny Trees

Cited by: 0
Authors
Ibrahim, Shibal [1 ]
Behdin, Kayhan [1 ]
Mazumder, Rahul [1 ]
Affiliations
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
MUTUAL INFORMATION;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
We propose a new optimization-based approach for feature selection in tree ensembles, an important problem in statistics and machine learning. Popular tree ensemble toolkits, e.g., Gradient Boosted Trees and Random Forests, support feature selection post-training based on feature importance scores; while widely used, these scores are known to have drawbacks. We propose Skinny Trees: an end-to-end toolkit for feature selection in tree ensembles, where we train a tree ensemble while controlling the number of selected features. Our optimization-based approach learns an ensemble of differentiable trees and simultaneously performs feature selection using a grouped ℓ0-regularizer. We use first-order methods for optimization and present convergence guarantees for our approach. We use a dense-to-sparse regularization scheduling scheme that can lead to more expressive and sparser tree ensembles. On 15 synthetic and real-world datasets, Skinny Trees can achieve 1.5×-620× feature compression rates, leading to up to 10× faster inference over dense trees, without any loss in performance. Skinny Trees lead to better feature selection than many existing toolkits; e.g., in terms of AUC for a 25% feature budget, Skinny Trees outperform LightGBM by 10.2% (up to 37.7%) and Random Forests by 3% (up to 12.5%).
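The two ingredients named in the abstract — feature selection via a grouped ℓ0-regularizer and a dense-to-sparse regularization schedule — can be sketched as below. This is a minimal illustration under assumptions, not the authors' implementation: it treats each row of a weight matrix as the group of parameters attached to one feature, applies the standard hard-thresholding proximal step for a group ℓ0 penalty, and uses a simple linear ramp for the schedule (the function names and the ramp shape are our own choices).

```python
import numpy as np

def group_l0_prox(W, lam):
    """Proximal step for the group l0 penalty lam * #{nonzero rows of W}.

    Row j of W collects all parameters that touch feature j; under a
    squared-l2 data term, the closed-form solution keeps row j iff
    ||W_j||^2 / 2 >= lam, and zeroes it out (deselects the feature)
    otherwise.
    """
    norms_sq = np.sum(W ** 2, axis=1)          # per-feature group norms
    keep = norms_sq >= 2.0 * lam               # hard-thresholding rule
    return W * keep[:, None]

def dense_to_sparse_schedule(lam_max, epochs):
    """Linearly ramp the penalty from 0 (dense ensemble) to lam_max (sparse),
    so early training is unconstrained and sparsity is imposed gradually."""
    return [lam_max * t / max(epochs - 1, 1) for t in range(epochs)]

# Feature 0 has large weights and survives; feature 1 is weak and is pruned.
W = np.array([[3.0, 0.0],
              [0.1, 0.1]])
print(group_l0_prox(W, lam=1.0))
print(dense_to_sparse_schedule(1.0, epochs=5))
```

In an end-to-end training loop, a step like `group_l0_prox` would be applied after each gradient update to the feature-level weights of the differentiable trees, with `lam` drawn from the schedule so that the ensemble moves from dense to sparse over training.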
Pages: 27
Related Papers
50 items total
  • [41] The end-to-end effects of Internet path selection
    Savage, S
    Collins, A
    Hoffman, E
    Snell, J
    Anderson, T
    ACM SIGCOMM'99 CONFERENCE: APPLICATIONS, TECHNOLOGIES, ARCHITECTURES, AND PROTOCOLS FOR COMPUTER COMMUNICATIONS, 1999, 29 (04): 289 - 299
  • [42] End-to-end effects of Internet path selection
    Department of Computer Science and Engineering, University of Washington, Seattle, United States
    Comput Commun Rev, 4 (289-299):
  • [43] Feature Importance Ranking of Random Forest-Based End-to-End Learning Algorithm
    Yuan, Xiaoguang
    Liu, Shiruo
    Feng, Wei
    Dauphin, Gabriel
    REMOTE SENSING, 2023, 15 (21)
  • [44] End-to-End Light Field Image Compression with Multi-Domain Feature Learning
    Ye, Kangsheng
    Li, Yi
    Li, Ge
    Jin, Dengchao
    Zhao, Bo
    APPLIED SCIENCES-BASEL, 2024, 14 (06):
  • [45] Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
    Yoon, Hyungchan
    Um, Seyun
    Kim, Changhwan
    Kang, Hong-Goo
    INTERSPEECH 2023, 2023, : 3023 - 3027
  • [46] End-to-End Lifelong Learning: a Framework to Achieve Plasticities of both the Feature and Classifier Constructions
    Wangli Hao
    Junsong Fan
    Zhaoxiang Zhang
    Guibo Zhu
    Cognitive Computation, 2018, 10 : 321 - 333
  • [47] Speech Recognition for Air Traffic Control via Feature Learning and End-to-End Training
    Fan, Peng
    Hua, Xiyao
    Lin, Yi
    Yang, Bo
    Zhang, Jianwei
    Ge, Wenyi
    Guo, Dongyue
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (04) : 538 - 544
  • [48] End-to-End Lifelong Learning: a Framework to Achieve Plasticities of both the Feature and Classifier Constructions
    Hao, Wangli
    Fan, Junsong
    Zhang, Zhaoxiang
    Zhu, Guibo
    COGNITIVE COMPUTATION, 2018, 10 (02) : 321 - 333
  • [49] Resource-Constrained Specific Emitter Identification Using End-to-End Sparse Feature Selection
    Tao, Mengyuan
    Fu, Xue
    Lin, Yun
    Wang, Yu
    Yao, Zhisheng
    Shi, Shengnan
    Gui, Guan
    IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM, 2023, : 6067 - 6072
  • [50] E2E-FS: An End-to-End Feature Selection Method for Neural Networks
    Cancela, Brais
    Bolon-Canedo, Veronica
    Alonso-Betanzos, Amparo
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (07) : 8311 - 8323