Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity

Cited by: 2
Authors
Xu, Shiyun [1 ]
Bu, Zhiqi [1 ]
Chaudhari, Pratik [2 ]
Barnett, Ian J. [3 ]
Affiliations
[1] Univ Penn, Dept Appl Math & Computat Sci, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Elect & Syst Engn, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Biostat Epidemiol & Informat, Philadelphia, PA 19104 USA
Funding
U.S. National Science Foundation (NSF)
Keywords
Interpretability; Additive Models; Group LASSO; Feature Selection; Variable Selection; LASSO; Regression; Shrinkage
DOI
10.1007/978-3-031-43418-1_21
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Interpretable machine learning has demonstrated impressive performance while preserving explainability. In particular, neural additive models (NAMs) bring interpretability to black-box deep learning and achieve state-of-the-art accuracy within the large family of generalized additive models. To empower NAMs with feature selection and improved generalization, we propose sparse neural additive models (SNAM), which employ group sparsity regularization (e.g., Group LASSO): each feature is learned by a sub-network whose trainable parameters are clustered as a group. We study the theoretical properties of SNAM with novel techniques that handle a non-parametric truth, extending classical sparse linear models such as the LASSO, which apply only to a parametric truth. Specifically, we show that SNAM trained with subgradient or proximal gradient descent provably converges to zero training loss as t → ∞, and that its estimation error vanishes asymptotically as n → ∞. We also prove that SNAM, like the LASSO, achieves exact support recovery, i.e., perfect feature selection, under appropriate regularization. Moreover, we show that SNAM generalizes well and preserves identifiability, recovering each feature's effect. We validate our theory through extensive experiments and further demonstrate the accuracy and efficiency of SNAM (the appendix is available at https://arxiv.org/abs/2202.12482).
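To make the mechanism in the abstract concrete, the following is a minimal PyTorch sketch of the SNAM idea, not the authors' reference implementation: each feature gets its own small sub-network f_j, the model predicts f(x) = Σ_j f_j(x_j) + b, and training minimizes the squared loss plus a Group LASSO penalty λ Σ_j ||θ_j||₂, where θ_j collects all trainable parameters of sub-network j. The network sizes, hyper-parameters, data, and the prox_group_lasso helper below are illustrative assumptions; the proximal step is the standard group soft-thresholding operator, matching the proximal gradient descent the abstract refers to.

    # Minimal SNAM sketch in PyTorch. Sub-network sizes, hyper-parameters,
    # and helper names are illustrative assumptions, not the paper's code.
    import torch
    import torch.nn as nn

    class SNAM(nn.Module):
        """One small MLP per input feature; the prediction is the sum of
        the per-feature outputs plus a bias, as in (neural) additive models."""
        def __init__(self, n_features: int, hidden: int = 32):
            super().__init__()
            self.subnets = nn.ModuleList(
                nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
                for _ in range(n_features)
            )
            self.bias = nn.Parameter(torch.zeros(1))

        def forward(self, x):  # x: (batch, n_features)
            # f(x) = sum_j f_j(x_j) + b; each f_j sees only feature j.
            contribs = [net(x[:, j:j + 1]) for j, net in enumerate(self.subnets)]
            return torch.cat(contribs, dim=1).sum(dim=1, keepdim=True) + self.bias

    def group_norms(model: SNAM) -> torch.Tensor:
        """l2 norm of each sub-network's full parameter vector (one group per feature)."""
        return torch.stack([
            torch.cat([p.reshape(-1) for p in net.parameters()]).norm()
            for net in model.subnets
        ])

    @torch.no_grad()
    def prox_group_lasso(model: SNAM, lr: float, lam: float) -> None:
        """Proximal step for lam * sum_j ||theta_j||_2: shrink each group toward
        zero; a group whose norm falls below lr * lam is zeroed out entirely,
        which deselects that feature."""
        for net in model.subnets:
            norm = torch.cat([p.reshape(-1) for p in net.parameters()]).norm()
            scale = torch.clamp(1.0 - lr * lam / (norm + 1e-12), min=0.0)
            for p in net.parameters():
                p.mul_(scale)

    # Proximal gradient descent on the squared loss (placeholder data/settings).
    n, d, lam, lr = 256, 10, 1e-3, 1e-2
    X, y = torch.randn(n, d), torch.randn(n, 1)
    model = SNAM(d)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for step in range(1000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), y)  # smooth part of the objective
        loss.backward()
        opt.step()                                  # gradient step on the loss
        prox_group_lasso(model, lr, lam)            # prox step on the group penalty
    print("selected features:", (group_norms(model) > 0).nonzero().flatten().tolist())

Note that the prox step can zero out an entire group at once, so feature selection falls out of training rather than being a post-hoc step: sub-networks whose groups stay at zero correspond to deselected features, which is the support-recovery behavior the abstract describes.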
Pages: 343-359 (17 pages)