Optimal variable selection in multi-group sparse discriminant analysis

Cited by: 7
Authors
Gaynanova, Irina [1]
Kolar, Mladen [2]
Affiliations
[1] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
[2] Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2015 / Vol. 9 / No. 2
Keywords
Classification; Fisher's discriminant analysis; group penalization; high-dimensional statistics; MODEL SELECTION; CLASSIFICATION; CENTROIDS; RECOVERY;
DOI
10.1214/15-EJS1064
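Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes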
020208 ; 070103 ; 0714 ;
Abstract
This article considers the problem of multi-group classification in the setting where the number of variables p is larger than the number of observations n. Several methods that address this problem have been proposed in the literature; however, their variable selection performance is either unknown or suboptimal compared with the results known for the two-group case. In this work we provide sharp conditions for the consistent recovery of relevant variables in the multi-group case using the discriminant analysis proposal of Gaynanova et al. [7]. We achieve rates of convergence that attain the optimal scaling of the sample size n, the number of variables p, and the sparsity level s. These rates are significantly faster than the best known results in the multi-group case. Moreover, they coincide with the minimax optimal rates for the two-group case. We validate our theoretical results with numerical analysis.
Pages: 2007-2034
Number of pages: 28
Related papers
50 records in total
  • [21] Variable selection in discriminant analysis in the presence of outliers
    Steel, SJ
    Louw, N
    ITI 2001: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY INTERFACES, 2001, : 251 - 256
  • [22] Sparse Maximum Margin Discriminant Analysis for Gene Selection
    Cui, Yan
    Yang, Jian
    Zheng, Chun-Hou
    BIO-INSPIRED COMPUTING AND APPLICATIONS, 2012, 6840 : 649 - +
  • [23] Analysis of new variable selection methods for discriminant analysis
    Pacheco, Joaquin
    Casado, Silvia
    Nunez, Laura
    Gomez, Olga
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2006, 51 (03) : 1463 - 1478
  • [24] Generalized linear latent variable modeling for multi-group studies
    Eickhoff, JC
    Amemiya, Y
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2005, 34 (9-10) : 1991 - 2008
  • [25] Proximal methods for sparse optimal scoring and discriminant analysis
    Atkins, Summer
    Einarsson, Gudmundur
    Clemmensen, Line
    Ames, Brendan
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2023, 17 (04) : 983 - 1036
  • [26] A Hierarchical Multi-Unidimensional IRT Approach for Analyzing Sparse, Multi-Group Data for Integrative Data Analysis
    Yan Huo
    Jimmy de la Torre
    Eun-Young Mun
    Su-Young Kim
    Anne E. Ray
    Yang Jiao
    Helene R. White
    Psychometrika, 2015, 80 : 834 - 855
  • [27] Proximal methods for sparse optimal scoring and discriminant analysis
    Summer Atkins
    Gudmundur Einarsson
    Line Clemmensen
    Brendan Ames
    Advances in Data Analysis and Classification, 2023, 17 : 983 - 1036
  • [28] A Hierarchical Multi-Unidimensional IRT Approach for Analyzing Sparse, Multi-Group Data for Integrative Data Analysis
    Huo, Yan
    de la Torre, Jimmy
    Mun, Eun-Young
    Kim, Su-Young
    Ray, Anne E.
    Jiao, Yang
    White, Helene R.
    PSYCHOMETRIKA, 2015, 80 (03) : 834 - 855
  • [29] Simultaneous variable and factor selection via sparse group lasso in factor analysis
    Dang, Yuanchu
    Wang, Qing
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2019, 89 (14) : 2744 - 2764
  • [30] Sparse optimal discriminant clustering
    Yanhong Wang
    Yixin Fang
    Junhui Wang
    Statistics and Computing, 2016, 26 : 629 - 639