Feature Selection using e-values

Authors
Majumdar, Subhabrata [1, 2]
Chatterjee, Snigdhansu [1]
Affiliations
[1] Univ Minnesota Twin Cities, Sch Stat, Minneapolis, MN 55455 USA
[2] Splunk, San Francisco, CA 94107 USA
Funding
U.S. National Science Foundation;
Keywords
VARIABLE SELECTION; MODEL; REGRESSION; BOOTSTRAP; DEPTH; DIMENSION; LASSO;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the context of supervised parametric models, we introduce the concept of e-values. An e-value is a scalar quantity that represents the proximity of the sampling distribution of parameter estimates in a model trained on a subset of features to that of the model trained on all features (i.e., the full model). Under general conditions, a rank ordering of e-values separates models that contain all essential features from those that do not. The e-values are applicable to a wide range of parametric models. We use data depths and a fast resampling-based algorithm to implement a feature selection procedure using e-values, and we provide consistency results. For a p-dimensional feature space, this procedure requires fitting only the full model and evaluating p + 1 models, as opposed to the traditional requirement of fitting and evaluating 2^p models. Through experiments across several model settings and synthetic and real datasets, we establish the e-values method as a promising general alternative to existing model-specific methods of feature selection.
Pages: 21