Robust adaptive LASSO in high-dimensional logistic regression

Times cited: 0
Authors
Basu, Ayanendranath [1 ]
Ghosh, Abhik [1 ]
Jaenada, Maria [2 ]
Pardo, Leandro [2 ]
Affiliations
[1] Indian Stat Inst, Interdisciplinary Stat Res Unit, 203 BT Rd, Kolkata 700108, India
[2] Univ Complutense Madrid, Stat & OR, Plaza Ciencias 3, Madrid 28040, Spain
Keywords
Density power divergence; High-dimensional data; Logistic regression; Oracle properties; Variable selection
Keywords Plus
VARIABLE SELECTION; GENE SELECTION; SPARSE REGRESSION; CLASSIFICATION; CANCER; MICROARRAYS; LIKELIHOOD; ALGORITHM; MODELS
DOI
10.1007/s10260-024-00760-2
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Penalized logistic regression is extremely useful for binary classification with a large number of covariates (possibly exceeding the sample size) and has several real-life applications, including genomic disease classification. However, the existing methods based on the likelihood loss function are sensitive to data contamination and other noise, so robust methods are needed for stable and more accurate inference. In this paper, we propose a family of robust estimators for sparse logistic models that combines the popular density power divergence based loss function with general adaptively weighted LASSO penalties. We study the local robustness of the proposed estimators through their influence functions and derive their oracle properties and asymptotic distributions. Through extensive empirical illustrations, we demonstrate the significantly improved performance of the proposed estimators over existing ones, with particular gains in robustness. Our proposal is finally applied to analyse four different real datasets for cancer classification, obtaining robust and accurate models that simultaneously perform gene selection and patient classification.
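The estimator described above combines the density power divergence (DPD) loss for Bernoulli responses with an adaptively weighted LASSO penalty. Below is a minimal illustrative sketch of such a penalized objective in Python, written under the standard DPD formulation with tuning parameter alpha (alpha -> 0 recovers the negative log-likelihood up to constants); the function name, parameter choices and toy data are assumptions for illustration and need not match the paper's exact estimator or algorithm.

import numpy as np

def dpd_adaptive_lasso_objective(beta, X, y, alpha=0.5, lam=0.1, w=None):
    # DPD loss for logistic regression plus an adaptively weighted L1 penalty.
    # alpha: DPD tuning parameter (robustness/efficiency trade-off).
    # lam:   regularization strength; w: adaptive weights, e.g. 1/|beta_init|**gamma.
    pi = 1.0 / (1.0 + np.exp(-X @ beta))        # fitted success probabilities
    pi = np.clip(pi, 1e-10, 1 - 1e-10)          # numerical safeguard
    # empirical DPD loss for Bernoulli responses y_i in {0, 1}
    loss = np.mean(pi**(1 + alpha) + (1 - pi)**(1 + alpha)
                   - (1 + 1/alpha) * (y * pi**alpha + (1 - y) * (1 - pi)**alpha))
    if w is None:
        w = np.ones_like(beta)
    penalty = lam * np.sum(w * np.abs(beta))    # adaptively weighted LASSO penalty
    return loss + penalty

# Toy usage: p > n with a few mislabelled responses (contamination).
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = 2.0
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))
y[:5] = 1 - y[:5]
print(dpd_adaptive_lasso_objective(np.zeros(p), X, y, alpha=0.5, lam=0.05))

In practice the adaptive weights w would come from a pilot fit (e.g. a ridge or non-adaptive DPD estimate), and the objective would be minimized by a coordinate-descent or proximal-gradient routine over a grid of lam values.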
Pages: 33