The EBIC and a sequential procedure for feature selection in interactive linear models with high-dimensional data

Authors
Yawei He
Zehua Chen
Affiliations
[1] Chongqing Jiaotong University, Department of Applied Statistics
[2] National University of Singapore, Department of Statistics and Applied Probability
Keywords
High-dimensional data; EBIC; Feature selection; Interactive model; Sequential procedure; Selection consistency
Abstract
High-dimensional data arise in many important scientific fields, and their analysis poses great challenges to statisticians. In high-dimensional data, the relationships among the variables are complex, involving main effects as well as interaction effects of the covariates. The effect of some covariates is realized only through their interaction with others. This makes the consideration of interactive models imperative in the analysis of high-dimensional data. Because of the high spurious correlation among the covariates in high-dimensional data, conventional tools for dealing with interactive models become inappropriate. In this paper, we develop specific tools for feature selection in high-dimensional data with interactive models, including a version of the extended BIC (EBIC) for interactive models and a sequential feature selection procedure. Because main-effect and interaction features differ in nature, they are treated differently in both the EBIC for interactive models and the sequential procedure. The selection consistency of the EBIC for interactive models and of the sequential procedure is established. Simulation studies are carried out to verify the asymptotic properties in finite samples and to compare the proposed approach with non-sequential procedures. The approach developed in this paper is also applied to a real data set.
Pages: 155-180
Page count: 25