Robust adaptive variable selection in ultra-high dimensional linear regression models

Cited by: 2
Authors
Ghosh, Abhik [1]
Jaenada, Maria [2]
Pardo, Leandro [2]
Affiliations
[1] Indian Stat Inst, Kolkata, India
[2] Univ Complutense Madrid, Madrid, Spain
Keywords
High-dimensional linear regression models; adaptive LASSO estimator; non-polynomial dimensionality; oracle property; density power divergence; NONCONCAVE PENALIZED LIKELIHOOD; DENSITY POWER DIVERGENCE; LASSO;
DOI
10.1080/00949655.2023.2262669
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835
Abstract
We consider the problem of simultaneous variable selection and parameter estimation in an ultra-high dimensional linear regression model. The adaptive penalty functions are used in this regard to achieve the oracle variable selection property with simpler assumptions and lesser computational burden. Noting the non-robust nature of the usual adaptive procedures (e.g. adaptive LASSO) based on the squared error loss function against data contamination, quite frequent with modern large-scale data sets (e.g. noisy gene expression data, spectra and spectral data), in this paper, we present a new adaptive regularization procedure using a robust loss function based on the density power divergence (DPD) measure under a general class of error distributions. We theoretically prove that the proposed adaptive DPD-LASSO estimator of the regression coefficients is highly robust, consistent, asymptotically normal and leads to robust oracle-consistent variable selection under easily verifiable assumptions. Numerical illustrations are provided for the mostly used normal and heavy-tailed error densities. Finally, the proposal is applied to analyse an interesting spectral dataset, in the field of chemometrics, regarding the electron-probe X-ray microanalysis (EPXMA) of archaeological glass vessels from the 16th and 17th centuries.
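To make the idea in the abstract concrete, the following is a minimal sketch of what an adaptively penalized DPD objective can look like for normal errors. It assumes the standard DPD loss for a linear model with N(0, sigma^2) errors and adaptive weights of the form 1/|initial estimate|; the function names, the `eps` safeguard and the tuning values are illustrative assumptions, not the authors' implementation, and the record above does not specify these details.

```python
import numpy as np


def dpd_loss_normal(beta, sigma, X, y, alpha):
    """DPD loss for a linear model with N(0, sigma^2) errors (alpha > 0).

    Each observation enters through exp(-alpha * r^2 / (2 sigma^2)), so a
    gross outlier contributes almost nothing instead of a huge squared error.
    """
    r = y - X @ beta
    c = (2.0 * np.pi) ** (-alpha / 2.0) * sigma ** (-alpha)
    fit_term = np.mean(np.exp(-alpha * r**2 / (2.0 * sigma**2)))
    return c * (1.0 / np.sqrt(1.0 + alpha) - (1.0 + alpha) / alpha * fit_term)


def adaptive_weights(beta_init, eps=1e-6):
    """Hypothetical helper: weights 1 / (|initial estimate| + eps).

    eps is an assumption added here to avoid division by zero.
    """
    return 1.0 / (np.abs(beta_init) + eps)


def adaptive_dpd_lasso_objective(beta, sigma, X, y, alpha, lam, weights):
    """DPD loss plus a weighted L1 penalty (assumed adaptive-LASSO form)."""
    penalty = lam * np.sum(weights * np.abs(beta))
    return dpd_loss_normal(beta, sigma, X, y, alpha) + penalty
```

In practice the objective would be minimized over the coefficients (and the scale) for a chosen alpha and penalty level; larger alpha trades efficiency for robustness, and the squared error loss is recovered only in the limit as alpha tends to zero.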
Pages: 571-603
Number of pages: 33