Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models

Citations: 0
Authors
Jiang, Jinzhu [1 ]
Shang, Junfeng [1 ]
Affiliations
[1] Bowling Green State Univ, Dept Math & Stat, Bowling Green, OH 43403 USA
Keywords
feature screening; high-dimensional data; generalized linear models; logit model; nonconcave penalized likelihood
DOI
10.3390/e25060851
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
The two-stage feature screening method for linear models applies dimension reduction at the first stage to screen out nuisance features and reduce the dimension to a moderate size; at the second stage, penalized methods such as LASSO and SCAD can be applied for feature selection. Most subsequent work on sure independence screening has focused on the linear model. This motivates us to extend the independence screening method to generalized linear models, particularly those with a binary response, by using the point-biserial correlation. We develop a two-stage feature screening method called point-biserial sure independence screening (PB-SIS) for high-dimensional generalized linear models, aiming for high selection accuracy and low computational cost. We demonstrate that PB-SIS is a highly efficient feature screening method and that it possesses the sure independence property under certain regularity conditions. Simulation studies confirm the sure independence property as well as the accuracy and efficiency of PB-SIS. Finally, we apply PB-SIS to a real data example to show its effectiveness.
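The screening stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only ranks features by the absolute point-biserial correlation with a binary response and keeps the top d. The function names (`point_biserial`, `screen`, `simulate`), the simulated logit data, and the cutoff d = floor(n / log n) (a common choice in the sure independence screening literature) are all assumptions made for illustration. The second-stage penalized fit (LASSO or SCAD) is not shown.

```python
import math
import random


def point_biserial(x, y):
    """Point-biserial correlation between a continuous x and a 0/1 response y."""
    n = len(x)
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    n1, n0 = len(x1), len(x0)
    if n1 == 0 or n0 == 0:
        return 0.0  # correlation is undefined with a single class; treat as 0
    mean = sum(x) / n
    sd = math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)  # population SD
    if sd == 0.0:
        return 0.0  # constant feature carries no information
    m1 = sum(x1) / n1
    m0 = sum(x0) / n0
    return (m1 - m0) / sd * math.sqrt(n1 * n0 / n ** 2)


def screen(columns, y, d):
    """Return the sorted indices of the d features with the largest |r_pb|."""
    scores = [abs(point_biserial(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


def simulate(n=200, p=500, active=(0, 1, 2), beta=2.0, seed=0):
    """Hypothetical logit-model data: only the `active` features affect y."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    y = []
    for i in range(n):
        eta = beta * sum(columns[j][i] for j in active)
        prob = 1.0 / (1.0 + math.exp(-eta))
        y.append(1 if rng.random() < prob else 0)
    return columns, y


columns, y = simulate()
d = int(len(y) / math.log(len(y)))  # assumed cutoff: floor(n / log n)
kept = screen(columns, y, d)
print(f"kept {len(kept)} of {len(columns)} features")
```

The indices in `kept` would then be passed to a penalized logistic regression for the second stage. Because each feature is scored independently, the screening cost grows linearly in the number of features.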
Pages: 26
Related papers
50 in total
  • [21] Efficient penalized generalized linear mixed models for variable selection and genetic risk prediction in high-dimensional data
    St-Pierre, Julien
    Oualkacha, Karim
    Bhatnagar, Sahir Rai
    [J]. BIOINFORMATICS, 2023, 39 (02)
  • [22] Efficient test-based variable selection for high-dimensional linear models
    Gong, Siliang
    Zhang, Kai
    Liu, Yufeng
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2018, 166 : 17 - 31
  • [23] Variable selection in multivariate linear models with high-dimensional covariance matrix estimation
    Perrot-Dockes, Marie
    Levy-Leduc, Celine
    Sansonnet, Laure
    Chiquet, Julien
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2018, 166 : 78 - 97
  • [24] Feature selection in finite mixture of sparse normal linear models in high-dimensional feature space
    Khalili, Abbas
    Chen, Jiahua
    Lin, Shili
    [J]. BIOSTATISTICS, 2011, 12 (01) : 156 - 172
  • [25] Variable selection for high-dimensional generalized linear model with block-missing data
    He, Yifan
    Feng, Yang
    Song, Xinyuan
    [J]. SCANDINAVIAN JOURNAL OF STATISTICS, 2023, 50 (03) : 1279 - 1297
  • [26] Variable selection in the high-dimensional continuous generalized linear model with current status data
    Tian, Guo-Liang
    Wang, Mingqiu
    Song, Lixin
    [J]. JOURNAL OF APPLIED STATISTICS, 2014, 41 (03) : 467 - 483
  • [27] Linear-mixed effects models for feature selection in high-dimensional NMR spectra
    Mei, Yajun
    Kim, Seoung Bum
    Tsui, Kwok-Leung
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (03) : 4703 - 4708
  • [28] A semi-parametric approach to feature selection in high-dimensional linear regression models
    Liu, Yuyang
    Pi, Pengfei
    Luo, Shan
    [J]. COMPUTATIONAL STATISTICS, 2023, 38 (02) : 979 - 1000
  • [30] GREEDY VARIABLE SELECTION FOR HIGH-DIMENSIONAL COX MODELS
    Lin, Chien-Tong
    Cheng, Yu-Jen
    Ing, Ching-Kang
    [J]. STATISTICA SINICA, 2023, 33 : 1697 - 1719