Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity

Cited by: 2
Authors
Xu, Shiyun [1 ]
Bu, Zhiqi [1 ]
Chaudhari, Pratik [2 ]
Barnett, Ian J. [3 ]
Affiliations
[1] Univ Penn, Dept Appl Math & Computat Sci, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Elect & Syst Engn, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Biostat Epidemiol & Informat, Philadelphia, PA 19104 USA
Funding
U.S. National Science Foundation
Keywords
Interpretability; Additive Models; Group LASSO; Feature Selection; Variable Selection; LASSO; Regression; Shrinkage
DOI
10.1007/978-3-031-43418-1_21
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Interpretable machine learning has demonstrated impressive performance while preserving explainability. In particular, neural additive models (NAM) bring interpretability to black-box deep learning and achieve state-of-the-art accuracy within the large family of generalized additive models. To equip NAM with feature selection and improve its generalization, we propose the sparse neural additive model (SNAM), which applies group-sparsity regularization (e.g. the Group LASSO): each feature is learned by a sub-network whose trainable parameters are clustered into one group. We study the theoretical properties of SNAM with novel techniques that handle a non-parametric truth, thereby extending results for classical sparse linear models such as the LASSO, which apply only to a parametric truth. Specifically, we show that SNAM trained with subgradient or proximal gradient descent provably converges to zero training loss as t -> infinity, and that the estimation error of SNAM vanishes asymptotically as n -> infinity. We also prove that SNAM, like the LASSO, achieves exact support recovery, i.e. perfect feature selection, under appropriate regularization. Moreover, we show that SNAM generalizes well and preserves 'identifiability', recovering each feature's individual effect. We validate our theory through extensive experiments, which further confirm the accuracy and efficiency of SNAM. (The appendix is available at https://arxiv.org/abs/2202.12482.)
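To make the group-sparsity construction in the abstract concrete, below is a minimal sketch assuming PyTorch; it is not the authors' released code, and the names FeatureNet, SNAM, and group_lasso_penalty are illustrative. Each input feature gets its own small sub-network, the sub-network outputs are summed additively, and the penalty takes the unsquared L2 norm over each sub-network's entire parameter group, so an uninformative feature's whole sub-network can be driven to exactly zero.

import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    """One sub-network per input feature, as in NAM/SNAM."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):  # x: (batch, 1)
        return self.net(x)

class SNAM(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.subnets = nn.ModuleList(FeatureNet(hidden) for _ in range(n_features))

    def forward(self, x):  # x: (batch, n_features)
        # Additive model: sum of per-feature shape functions.
        return sum(net(x[:, [j]]) for j, net in enumerate(self.subnets)).squeeze(-1)

def group_lasso_penalty(model):
    # Unsquared L2 norm of each sub-network's parameters: one group per
    # feature, so the penalty can zero out a group and deselect a feature.
    return sum(
        torch.sqrt(sum(p.pow(2).sum() for p in net.parameters()))
        for net in model.subnets
    )

# One training step with the penalty added to the loss (subgradient descent).
model = SNAM(n_features=10)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(32, 10), torch.randn(32)
loss = nn.functional.mse_loss(model(x), y) + 1e-3 * group_lasso_penalty(model)
opt.zero_grad(); loss.backward(); opt.step()

The unsquared norm is what distinguishes the Group LASSO from ordinary weight decay: it shrinks whole parameter groups, and hence whole features, to exactly zero. The proximal-gradient variant analyzed in the paper would instead soft-threshold each group after a plain gradient step rather than folding the penalty into the loss.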
Pages: 343-359
Page count: 17