Feature Selection using e-values

Authors
Majumdar, Subhabrata [1, 2]
Chatterjee, Snigdhansu [1]
Affiliations
[1] Univ Minnesota Twin Cities, Sch Stat, Minneapolis, MN 55455 USA
[2] Splunk, San Francisco, CA 94107 USA
Funding
National Science Foundation (USA)
Keywords
VARIABLE SELECTION; MODEL; REGRESSION; BOOTSTRAP; DEPTH; DIMENSION; LASSO;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the context of supervised parametric models, we introduce the concept of e-values. An e-value is a scalar quantity that represents the proximity of the sampling distribution of parameter estimates in a model trained on a subset of features to that of the model trained on all features (i.e., the full model). Under general conditions, a rank ordering of e-values separates models that contain all essential features from those that do not. The e-values are applicable to a wide range of parametric models. We use data depths and a fast resampling-based algorithm to implement a feature selection procedure using e-values, and we provide consistency results. For a p-dimensional feature space, this procedure requires fitting only the full model and evaluating p + 1 models, as opposed to the traditional requirement of fitting and evaluating 2^p models. Through experiments across several model settings and on synthetic and real datasets, we establish the e-values method as a promising general alternative to existing model-specific methods of feature selection.
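The procedure in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, not the authors' implementation: the model is ordinary least squares, Gaussian draws around the full-model estimate stand in for the paper's resampling algorithm, Mahalanobis depth is used as the data depth, `tau` is a hypothetical spread-inflation parameter, and the selection rule (keep a feature when dropping it pushes the e-value below the full model's) is one plausible reading of the rank-ordering statement. The function names are illustrative. The sketch fits the full model once and then evaluates p + 1 models: the full model and the p drop-one models.

```python
import numpy as np


def mahalanobis_depth(points, reference):
    """Mahalanobis depth of each row of `points` relative to the cloud `reference`."""
    center = reference.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(reference, rowvar=False))
    diff = points - center
    dist2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return 1.0 / (1.0 + dist2)


def evalue_feature_selection(X, y, n_draws=1000, tau=None, seed=0):
    """Illustrative e-value style selection for a linear model (assumptions above)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Hypothetical inflation factor: grows with n, but slower than sqrt(n).
    tau = n ** 0.25 if tau is None else tau

    # Fit the full model once; no other model is ever fitted.
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sigma2 = resid @ resid / (n - p)
    cov_hat = sigma2 * np.linalg.inv(X.T @ X)

    # Two independent sets of draws approximating the (inflated) sampling
    # distribution of the full-model estimate: one is the reference cloud,
    # the other is evaluated against it.
    reference = rng.multivariate_normal(beta_hat, tau**2 * cov_hat, size=n_draws)
    evaluation = rng.multivariate_normal(beta_hat, tau**2 * cov_hat, size=n_draws)

    # e-value of the full model: mean depth of its own draws.
    e_full = mahalanobis_depth(evaluation, reference).mean()

    # e-values of the p drop-one models: zero out coefficient j in each draw.
    e_drop = np.empty(p)
    for j in range(p):
        reduced = evaluation.copy()
        reduced[:, j] = 0.0
        e_drop[j] = mahalanobis_depth(reduced, reference).mean()

    # Assumed rule: keep feature j if removing it lowers the e-value.
    selected = np.flatnonzero(e_drop < e_full)
    return selected, e_full, e_drop
```

Calling `evalue_feature_selection(X, y)` on a design matrix `X` of shape (n, p) returns the indices of the retained features together with the full-model and drop-one e-values. The cost is one model fit plus p + 1 depth evaluations, in contrast to the 2^p fits of an exhaustive subset search.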
Pages: 21