Minimax Sparse Logistic Regression for Very High-Dimensional Feature Selection

Cited by: 37
Authors
Tan, Mingkui [1 ]
Tsang, Ivor W. [1 ]
Wang, Li [2 ]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
[2] Univ Calif San Diego, Dept Math, San Diego, CA 92093 USA
Keywords
Feature selection; minimax problem; single-nucleotide polymorphism (SNP) detection; smoothing method; sparse logistic regression; GENE; CLASSIFICATION; ALGORITHM; OPTIMIZATION;
DOI
10.1109/TNNLS.2013.2263427
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Because of its strong convexity and probabilistic underpinnings, logistic regression (LR) is widely used in many real-world applications. However, in many problems, such as bioinformatics, choosing a small subset of features with the most discriminative power is desirable for interpreting the prediction model, making robust predictions, or performing deeper analysis. To achieve a sparse solution with respect to input features, many sparse LR models have been proposed. However, it remains challenging for them to efficiently obtain unbiased sparse solutions to very high-dimensional problems (e.g., identifying the most discriminative subset from millions of features). In this paper, we propose a new minimax sparse LR model for very high-dimensional feature selection, which can be efficiently solved by a cutting plane algorithm. To solve the resultant nonsmooth minimax subproblems, a smoothing coordinate descent method is presented. Numerical issues and the convergence rate of this method are carefully studied. Experimental results on several synthetic and real-world datasets show that the proposed method obtains better prediction accuracy with the same number of selected features, and has better or competitive scalability on very high-dimensional problems compared with the baseline methods, including ℓ1-regularized LR.
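As a point of reference for the ℓ1-regularized LR baseline mentioned in the abstract (not the authors' minimax method), the following minimal sketch selects features by keeping the nonzero coefficients of an ℓ1-penalized logistic model; the dataset size and the regularization strength C are illustrative assumptions.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Synthetic high-dimensional data: many features, only a few informative
    # (sizes chosen for illustration only).
    X, y = make_classification(n_samples=200, n_features=5000,
                               n_informative=20, random_state=0)

    # The l1 penalty drives most coefficients exactly to zero, giving a
    # sparse solution with respect to the input features.
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    clf.fit(X, y)

    # Selected features are those with nonzero coefficients.
    selected = np.flatnonzero(clf.coef_.ravel())
    print(f"{selected.size} features selected out of {X.shape[1]}")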
Pages: 1609-1622
Number of pages: 14
Related Papers
50 records in total
  • [1] High-Dimensional Classification by Sparse Logistic Regression
    Abramovich, Felix
    Grinshtein, Vadim
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (05) : 3068 - 3079
  • [2] Sparse Bayesian variable selection in high-dimensional logistic regression models with correlated priors
    Ma, Zhuanzhuan
    Han, Zifei
    Ghosh, Souparno
    Wu, Liucang
    Wang, Min
    [J]. STATISTICAL ANALYSIS AND DATA MINING, 2024, 17 (01)
  • [3] Nearly Optimal Minimax Estimator for High-Dimensional Sparse Linear Regression
    Zhang, Li
    [J]. ANNALS OF STATISTICS, 2013, 41 (04) : 2149 - 2175
  • [4] Fully Bayesian logistic regression with hyper-LASSO priors for high-dimensional feature selection
    Li, Longhai
    Yao, Weixin
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (14) : 2827 - 2851
  • [5] Preconditioning for feature selection and regression in high-dimensional problems
    Paul, Debashis
    Bair, Eric
    Hastie, Trevor
    Tibshirani, Robert
    [J]. ANNALS OF STATISTICS, 2008, 36 (04) : 1595 - 1618
  • [6] Efficient Learning and Feature Selection in High-Dimensional Regression
    Ting, Jo-Anne
    D'Souza, Aaron
    Vijayakumar, Sethu
    Schaal, Stefan
    [J]. NEURAL COMPUTATION, 2010, 22 (04) : 831 - 886
  • [7] Robust and sparse estimation methods for high-dimensional linear and logistic regression
    Kurnaz, Fatma Sevinc
    Hoffmann, Irene
    Filzmoser, Peter
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2018, 172 : 211 - 222
  • [8] Feature selection for high-dimensional regression via sparse LSSVR based on Lp-norm
    Li, Chun-Na
    Shao, Yuan-Hai
    Zhao, Da
    Guo, Yan-Ru
    Hua, Xiang-Yu
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2021, 36 (02) : 1108 - 1130
  • [9] Nonnegative estimation and variable selection under minimax concave penalty for sparse high-dimensional linear regression models
    Li, Ning
    Yang, Hu
    [J]. STATISTICAL PAPERS, 2021, 62 (02) : 661 - 680