Robust adaptive variable selection in ultra-high dimensional linear regression models

Cited by: 3
Authors
Ghosh, Abhik [1 ]
Jaenada, Maria [2 ]
Pardo, Leandro [2 ]
Affiliations
[1] Indian Stat Inst, Kolkata, India
[2] Univ Complutense Madrid, Madrid, Spain
Keywords
High-dimensional linear regression models; adaptive LASSO estimator; non-polynomial dimensionality; oracle property; density power divergence; NONCONCAVE PENALIZED LIKELIHOOD; DENSITY POWER DIVERGENCE; LASSO;
DOI
10.1080/00949655.2023.2262669
CLC classification
TP39 [Computer applications]
Discipline codes
081203; 0835
Abstract
We consider the problem of simultaneous variable selection and parameter estimation in an ultra-high dimensional linear regression model. Adaptive penalty functions are used in this regard to achieve the oracle variable selection property under simpler assumptions and with a lower computational burden. The usual adaptive procedures (e.g. the adaptive LASSO), being based on the squared error loss, are non-robust against the data contamination that is frequent in modern large-scale data sets (e.g. noisy gene expression data, spectral data). In this paper, we therefore present a new adaptive regularization procedure using a robust loss function based on the density power divergence (DPD) measure under a general class of error distributions. We theoretically prove that the proposed adaptive DPD-LASSO estimator of the regression coefficients is highly robust, consistent and asymptotically normal, and that it leads to robust oracle-consistent variable selection under easily verifiable assumptions. Numerical illustrations are provided for the most commonly used normal and heavy-tailed error densities. Finally, the proposal is applied to analyse an interesting spectral dataset from the field of chemometrics, concerning the electron-probe X-ray microanalysis (EPXMA) of archaeological glass vessels from the 16th and 17th centuries.
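The abstract combines two ingredients: a robust DPD loss that exponentially downweights large residuals, and an adaptive LASSO penalty whose per-coefficient weights come from a pilot estimate. The sketch below is a minimal illustration of that idea, not the authors' actual algorithm: it assumes normal errors with a fixed scale, uses ordinary least squares as the pilot, and fits the penalized DPD loss by plain proximal gradient descent (ISTA). All constants (alpha, lam, step) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- simulated sparse regression with 10% gross outliers (illustrative only) ---
n, p = 200, 5
beta_true = np.array([2.0, 0.0, 0.0, 1.5, 0.0])
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.5 * rng.standard_normal(n)
out = rng.choice(n, size=20, replace=False)
y[out] += 15.0                                  # contaminate 10% of responses

alpha, sigma = 0.5, 1.0                         # DPD tuning parameter and (fixed) scale
c2 = (1 + 1 / alpha) * (2 * np.pi * sigma**2) ** (-alpha / 2)

def dpd_grad(beta):
    """Gradient of the empirical DPD loss for N(x'beta, sigma^2) errors
    (additive constants dropped); outlying residuals get weight ~ 0."""
    r = y - X @ beta
    w = np.exp(-alpha * r**2 / (2 * sigma**2))  # exponential downweighting
    return -(c2 * alpha / sigma**2) * (X.T @ (w * r)) / n

# pilot estimate (OLS) -> adaptive LASSO weights w_j = 1 / |beta_pilot_j|
beta_pilot = np.linalg.lstsq(X, y, rcond=None)[0]
ada_w = 1.0 / (np.abs(beta_pilot) + 1e-8)

# proximal gradient (ISTA): DPD loss + lam * sum_j ada_w[j] * |beta_j|
lam, step = 0.05, 0.3
beta = beta_pilot.copy()
for _ in range(2000):
    z = beta - step * dpd_grad(beta)
    thr = step * lam * ada_w
    beta = np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)  # soft-thresholding

print(np.round(beta, 2))
```

The key robustness mechanism is visible in `dpd_grad`: each residual enters the gradient multiplied by `exp(-alpha * r^2 / (2 sigma^2))`, so the 20 contaminated observations (residuals near 15) contribute essentially nothing, while the adaptive weights shrink the truly-zero coefficients much harder than the signals.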
Pages: 571-603
Page count: 33
Related papers
50 records in total
  • [21] Profile forward regression screening for ultra-high dimensional semiparametric varying coefficient partially linear models
    Li, Yujie
    Li, Gaorong
    Lian, Heng
    Tong, Tiejun
    JOURNAL OF MULTIVARIATE ANALYSIS, 2017, 155 : 133 - 150
  • [22] Robust Information Criterion for Model Selection in Sparse High-Dimensional Linear Regression Models
    Gohain, Prakash Borpatra
    Jansson, Magnus
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2023, 71 : 2251 - 2266
  • [23] Penalized estimation in finite mixture of ultra-high dimensional regression models
    Tang, Shiyi
    Zheng, Jiali
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2022, 51 (17) : 5971 - 5992
  • [24] Variable selection for ultra-high-dimensional logistic models
    Du, Pang
    Wu, Pan
    Liang, Hua
    PERSPECTIVES ON BIG DATA ANALYSIS: METHODOLOGIES AND APPLICATIONS, 2014, 622 : 141 - 158
  • [25] A group adaptive elastic-net approach for variable selection in high-dimensional linear regression
    Hu, Jianhua
    Huang, Jian
    Qiu, Feng
    SCIENCE CHINA-MATHEMATICS, 2018, 61 (01) : 173 - 188
  • [28] Variable selection and transformation in linear regression models
    Yeo, IK
    STATISTICS & PROBABILITY LETTERS, 2005, 72 (03) : 219 - 226
  • [29] On the Consistency of Bayesian Variable Selection for High Dimensional Linear Models
    Wang, Shuyun
    Luan, Yihui
    PROCEEDINGS OF THE 2009 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND NATURAL COMPUTING, VOL II, 2009, : 211 - 214
  • [30] Early stopping aggregation in selective variable selection ensembles for high-dimensional linear regression models
    Zhang, Chun-Xia
    Zhang, Jiang-She
    Yin, Qing-Yan
    KNOWLEDGE-BASED SYSTEMS, 2018, 153 : 1 - 11