The EBIC and a sequential procedure for feature selection in interactive linear models with high-dimensional data

Cited by: 0
Authors
Yawei He
Zehua Chen
Affiliations
[1] Chongqing Jiaotong University, Department of Applied Statistics
[2] National University of Singapore, Department of Statistics and Applied Probability
Keywords
High-dimensional data; EBIC; Feature selection; Interactive model; Sequential procedure; Selection consistency
DOI
Not available
Abstract
High-dimensional data arise in many important scientific fields, and their analysis poses great challenges to statisticians. The relationships among the variables in such data are complex, involving both main effects and interaction effects of the covariates; the effect of some covariates is realized only through their interaction with others. This makes the consideration of interactive models imperative in the analysis of high-dimensional data. Because of the high spurious correlation among the covariates in high-dimensional data, conventional tools for dealing with interactive models become inappropriate. In this paper, we develop specific tools for feature selection in high-dimensional data with interactive models, including a version of the extended BIC (EBIC) for interactive models and a sequential feature selection procedure. Main-effect and interaction features are treated differently in both the EBIC and the sequential procedure because of their different natures. The selection consistency of the EBIC for interactive models and of the sequential procedure is established. Simulation studies are carried out to verify the asymptotic properties in finite samples and to compare the approach with non-sequential procedures. The approach developed in this paper is also applied to a real data set.
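The abstract does not state the criterion itself, so the following is only a minimal sketch of what an EBIC that treats main-effect and interaction features differently might look like. It assumes the general EBIC form (a log-likelihood term, the usual BIC penalty, and an extra log-binomial penalty on the size of the model space), with one extra penalty over the p candidate main effects and another over the p(p-1)/2 candidate pairwise interactions. The function names, the parameters `gamma_m` and `gamma_i`, and their default values are illustrative assumptions, not the authors' definitions.

```python
import math
import numpy as np


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def ebic_interactive(y, X, main_idx, inter_pairs, gamma_m=0.5, gamma_i=0.5):
    """EBIC-style score for a linear model with main and interaction terms.

    Assumed form (hypothetical, for illustration only):
        n*log(RSS/n) + (m + q)*log(n)
        + 2*gamma_m*log C(p, m) + 2*gamma_i*log C(p(p-1)/2, q)
    where m = number of main effects and q = number of interactions.
    """
    n, p = X.shape
    # Design matrix: intercept, selected main effects, selected products.
    cols = [np.ones(n)]
    cols += [X[:, j] for j in main_idx]
    cols += [X[:, j] * X[:, k] for j, k in inter_pairs]
    design = np.column_stack(cols)

    # Ordinary least squares fit and residual sum of squares.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))

    m, q = len(main_idx), len(inter_pairs)
    fit_term = n * math.log(rss / n)
    bic_penalty = (m + q) * math.log(n)
    # Separate model-space penalties for the two kinds of features.
    extra_penalty = (2 * gamma_m * log_binom(p, m)
                     + 2 * gamma_i * log_binom(p * (p - 1) // 2, q))
    return fit_term + bic_penalty + extra_penalty
```

Under these assumptions, a lower score indicates a preferred model, and a sequential procedure would add one main-effect or interaction feature at a time and stop once the score no longer decreases. The intercept is not counted in the penalty here, which is a further simplifying choice.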
Pages: 155-180
Number of pages: 25