Variable selection and estimation in generalized linear models with the seamless L0 penalty

Cited by: 17

Authors:

Li, Zilin ^{[2]}

Wang, Sijian ^{[3,4]}

Lin, Xihong ^{[1]}

Affiliations:

[1] Harvard Univ, Dept Biostat, Boston, MA 02115 USA

[2] Tsinghua Univ, Dept Math, Beijing 100084, Peoples R China

[3] Univ Wisconsin, Dept Biostat & Med Informat, Madison, WI USA

[4] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA

Source:

CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE | 2012 / Vol. 40 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

BIC; consistency; coordinate descent algorithm; model selection; oracle property; penalized likelihood methods; SELO penalty; tuning parameter selection; NONCONCAVE PENALIZED LIKELIHOOD; ORACLE PROPERTIES; CROSS-VALIDATION; DIVERGING NUMBER; LASSO; REGRESSION; PARAMETER; CRITERION; RISK;

DOI:

10.1002/cjs.11165

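Chinese Library Classification:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Discipline classification codes:

020208 ; 070103 ; 0714 ;

Abstract: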

In this paper, we propose variable selection and estimation in generalized linear models using the seamless $L_0$ (SELO) penalized likelihood approach. The SELO penalty is a smooth function that very closely resembles the discontinuous $L_0$ penalty. We develop an efficient algorithm to fit the model, and show that the SELO-GLM procedure has the oracle property in the presence of a diverging number of variables. We propose a Bayesian information criterion (BIC) to select the tuning parameter. We show that, under some regularity conditions, the proposed SELO-GLM/BIC procedure consistently selects the true model. We perform simulation studies to evaluate the finite-sample performance of the proposed methods. These studies show that the proposed SELO-GLM procedure has better finite-sample performance than several existing methods, especially when the number of variables is large and the signals are weak. We apply SELO-GLM to a breast cancer genetic dataset to identify the SNPs that are associated with breast cancer risk. The Canadian Journal of Statistics 40: 745-769; 2012 © 2012 Statistical Society of Canada
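
The abstract describes the SELO penalty as a smooth function that closely resembles the discontinuous $L_0$ penalty, but this record does not give its formula. The sketch below assumes the form commonly stated for SELO, $p(\beta) = \frac{\lambda}{\log 2}\log\left(\frac{|\beta|}{|\beta|+\tau}+1\right)$, and compares it with $\lambda\, I(\beta \neq 0)$. The function names and the default values of `lam` and `tau` are illustrative assumptions, not taken from the paper.

```python
import math


def selo_penalty(beta, lam=1.0, tau=0.01):
    """Assumed SELO form: (lam / log 2) * log(|beta| / (|beta| + tau) + 1).

    lam scales the penalty; tau > 0 controls how sharply the curve
    rises from 0 toward lam (both defaults are illustrative).
    """
    a = abs(beta)
    return (lam / math.log(2.0)) * math.log(a / (a + tau) + 1.0)


def l0_penalty(beta, lam=1.0):
    """Discontinuous L0 penalty: lam if beta is nonzero, else 0."""
    return lam if beta != 0 else 0.0


if __name__ == "__main__":
    # Print both penalties side by side on a few coefficient values.
    for b in (0.0, 0.001, 0.01, 0.1, 1.0, 10.0):
        print(f"beta={b:<6} SELO={selo_penalty(b):.4f}  L0={l0_penalty(b):.1f}")
```

Under this assumed form the penalty is zero at $\beta = 0$, increases with $|\beta|$, and approaches $\lambda$ as $|\beta|$ grows relative to $\tau$, so a small `tau` gives a close continuous stand-in for the $L_0$ penalty.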


Pages: 745-769

Number of pages: 25

