Robust adaptive LASSO in high-dimensional logistic regression

Citations: 0
Authors
Basu, Ayanendranath [1 ]
Ghosh, Abhik [1 ]
Jaenada, Maria [2 ]
Pardo, Leandro [2 ]
Affiliations
[1] Indian Stat Inst, Interdisciplinary Stat Res Unit, 203 BT Rd, Kolkata 700108, India
[2] Univ Complutense Madrid, Stat & OR, Plaza Ciencias 3, Madrid 28040, Spain
Keywords
Density power divergence; High-dimensional data; Logistic regression; Oracle properties; Variable selection
KeyWords Plus
VARIABLE SELECTION; GENE SELECTION; SPARSE REGRESSION; CLASSIFICATION; CANCER; MICROARRAYS; LIKELIHOOD; ALGORITHM; MODELS
DOI
10.1007/s10260-024-00760-2
CLC classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline codes
020208 ; 070103 ; 0714 ;
Abstract
Penalized logistic regression is extremely useful for binary classification with a large number of covariates (possibly exceeding the sample size), and has several real-life applications, including genomic disease classification. However, existing methods based on the likelihood loss function are sensitive to data contamination and other noise, so robust methods are needed for stable and more accurate inference. In this paper, we propose a family of robust estimators for sparse logistic models that combines the popular density power divergence based loss function with general adaptively weighted LASSO penalties. We study the local robustness of the proposed estimators through their influence functions, and also derive their oracle properties and asymptotic distributions. Through extensive empirical illustrations, we demonstrate the significantly improved performance of our proposed estimators over existing ones, with particular gains in robustness. Our proposal is finally applied to analyse four real datasets for cancer classification, obtaining robust and accurate models that simultaneously perform gene selection and patient classification.
Pages: 33
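The abstract's core idea, replacing the likelihood loss with a density power divergence (DPD) loss and adding an adaptively weighted L1 penalty, can be sketched numerically. The following is a minimal illustration, not the authors' algorithm: the DPD loss formula for Bernoulli outcomes is standard, but the solver (plain proximal gradient), function names, and tuning values (`alpha`, `lam`, `step`) are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    # Clip the linear predictor to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def dpd_loss(beta, X, y, alpha):
    # Density power divergence loss for logistic regression (alpha > 0),
    # up to additive constants; alpha -> 0 recovers the likelihood loss.
    p = np.clip(sigmoid(X @ beta), 1e-10, 1.0 - 1e-10)
    return np.mean(p**(1 + alpha) + (1 - p)**(1 + alpha)
                   - (1 + alpha) * (y * p**alpha + (1 - y) * (1 - p)**alpha))

def dpd_grad(beta, X, y, alpha):
    # Analytic gradient of the smooth DPD loss via the chain rule.
    p = np.clip(sigmoid(X @ beta), 1e-10, 1.0 - 1e-10)
    dldp = (1 + alpha) * (p**alpha - (1 - p)**alpha
                          - alpha * y * p**(alpha - 1)
                          + alpha * (1 - y) * (1 - p)**(alpha - 1))
    return X.T @ (dldp * p * (1 - p)) / len(y)

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fit_robust_adaptive_lasso(X, y, alpha=0.5, lam=0.05, w=None,
                              step=0.1, n_iter=1000):
    # Proximal-gradient sketch: minimize DPD loss + lam * sum(w_j |beta_j|).
    # The weights w_j implement the *adaptive* LASSO (uniform by default here;
    # in practice they would come from a pilot estimate).
    n, d = X.shape
    if w is None:
        w = np.ones(d)
    beta = np.zeros(d)
    for _ in range(n_iter):
        beta = soft_threshold(beta - step * dpd_grad(beta, X, y, alpha),
                              step * lam * w)
    return beta
```

Larger `alpha` downweights observations that fit the model poorly (outliers, mislabeled cases), trading some efficiency for robustness; `alpha` near zero approaches the non-robust likelihood loss.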