A quantitative benchmark of neural network feature selection methods for detecting nonlinear signals

Cited by: 1
Authors
Passemiers, Antoine [1 ]
Folco, Pietro [2 ]
Raimondi, Daniele [1 ,3 ]
Birolo, Giovanni [2 ]
Moreau, Yves [1 ]
Fariselli, Piero [2 ]
Affiliations
[1] Katholieke Univ Leuven, ESAT STADIUS, Leuven, Belgium
[2] Univ Torino, Dept Med Sci, Turin, Italy
[3] Univ Montpellier, Inst Genet Mol Montpellier, Montpellier, France
Source
SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 1
DOI
10.1038/s41598-024-82583-5
Chinese Library Classification: O [Mathematical sciences and chemistry]; P [Astronomy and Earth sciences]; Q [Biosciences]; N [General natural sciences]
Discipline classification codes: 07; 0710; 09
Abstract
Classification and regression problems can be challenging when the relevant input features are diluted in noisy datasets, in particular when the sample size is limited. Traditional Feature Selection (FS) methods address this issue by relying on assumptions such as linearity or additivity of the relationships between features. Recently, a proliferation of Deep Learning (DL) models has emerged to tackle both FS and prediction at the same time, allowing non-linear modeling of the selected features. In this study, we systematically assess the performance of DL-based feature selection methods on synthetic datasets of varying complexity, and benchmark their efficacy in uncovering non-linear relationships between features. We also use the same settings to benchmark the reliability of gradient-based feature attribution techniques for Neural Networks (NNs), such as Saliency Maps (SM). A quantitative evaluation of the reliability of these approaches is currently missing. Our analysis indicates that even simple synthetic datasets can significantly challenge most of the DL-based FS and SM methods, while Random Forests, TreeShap, mRMR and LassoNet are the best-performing FS methods. Our conclusion is that when quantifying the relevance of a few non-linearly entangled predictive features diluted in a large number of irrelevant noisy variables, DL-based FS and SM interpretation methods are still far from being reliable.
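To make concrete the kind of signal the abstract is about, here is a minimal, stdlib-only sketch (an illustrative toy under assumed settings, not the paper's benchmark datasets or methods): only feature 0 drives the target, through a purely quadratic link, so a linear filter such as Pearson correlation scores it near zero, while the single-split variance-reduction criterion used at each node of a regression tree (the building block of Random Forests) singles it out. The sample size, noise level, and candidate split points are illustrative choices.

```python
import random

random.seed(0)

# Toy setup (illustrative only): 10 features, but only feature 0 matters,
# through a purely nonlinear (quadratic) link to the target.
n, d = 2000, 10
X = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
y = [x[0] ** 2 + 0.05 * random.gauss(0, 1) for x in X]

def pearson(a, b):
    """Pearson linear correlation between two equal-length sequences."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    sa = sum((u - ma) ** 2 for u in a) ** 0.5
    sb = sum((v - mb) ** 2 for v in b) ** 0.5
    return cov / (sa * sb)

def variance(v):
    m = sum(v) / len(v)
    return sum((u - m) ** 2 for u in v) / len(v)

def stump_gain(col, target):
    """Best variance reduction from a single split on `col`,
    i.e. what one node of a regression tree would measure."""
    base = variance(target)
    pairs = sorted(zip(col, target))
    best = 0.0
    for frac in (0.1, 0.25, 0.5, 0.75, 0.9):  # candidate split points
        k = int(frac * len(pairs))
        left = [t for _, t in pairs[:k]]
        right = [t for _, t in pairs[k:]]
        mixed = (len(left) * variance(left) + len(right) * variance(right)) / len(pairs)
        best = max(best, base - mixed)
    return best

cols = [[row[j] for row in X] for j in range(d)]
corrs = [abs(pearson(c, y)) for c in cols]  # linear filter: blind to x**2
gains = [stump_gain(c, y) for c in cols]    # tree-style score: detects it

print("best feature by |corr|:", corrs.index(max(corrs)))
print("best feature by split gain:", gains.index(max(gains)))
```

With this setup, `corrs[0]` stays near zero while `gains[0]` dominates the gains of the nine noise features, mirroring the paper's finding that tree-based scores can recover nonlinear signals that linear criteria miss; the exact values depend on the seed and sample size.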
Pages: 17
Related papers
50 records in total
  • [21] Neural network based approaches for detecting signals with unknown parameters
    de la Mata-Moya, David
    Jarabo-Amores, Pilar
    Rosa-Zurera, Manuel
    Vicen-Bueno, Raul
    Nieto-Borge, Jose Carlos
    2007 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING, CONFERENCE PROCEEDINGS BOOK, 2007, : 675 - 680
  • [22] Detecting and Refactoring Feature Envy Based on Graph Neural Network
    Yu, Dongjin
    Xu, Yihang
    Weng, Lehui
    Chen, Jie
    Chen, Xin
    Yang, Quanxin
    2022 IEEE 33RD INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE 2022), 2022, : 458 - 469
  • [23] Benchmark for filter methods for feature selection in high-dimensional classification data
    Bommert, Andrea
    Sun, Xudong
    Bischl, Bernd
    Rahnenfuehrer, Joerg
    Lang, Michel
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 143
  • [24] Boosting feature selection for Neural Network based regression
    Bailly, Kevin
    Milgram, Maurice
    NEURAL NETWORKS, 2009, 22 (5-6) : 748 - 756
  • [25] The role of feature selection in artificial neural network applications
    Kavzoglu, T
    Mather, PM
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2002, 23 (15) : 2919 - 2937
  • [26] A neural network document classifier with linguistic feature selection
    Lee, HM
    Chen, CM
    Hwang, CW
    INTELLIGENT PROBLEM SOLVING: METHODOLOGIES AND APPROACHES, PROCEEDINGS, 2000, 1821 : 555 - 560
  • [27] NEURAL NETWORK WITH SALIENCY BASED FEATURE SELECTION ABILITY
    Wang, Yunong
    Bian, Huanyu
    Yu, Nenghai
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 4502 - 4506
  • [28] Feature Selection, Deep Neural Network and Trend Prediction
    Fang, Yan
    Journal of Shanghai Jiaotong University (Science), 2018, 23 (02) : 297 - 307
  • [30] Hadoop neural network for parallel and distributed feature selection
    Hodge, Victoria J.
    O'Keefe, Simon
    Austin, Jim
    NEURAL NETWORKS, 2016, 78 : 24 - 35