Sparse PCA for High-Dimensional Data With Outliers

Cited by: 43
Authors
Hubert, Mia [1]
Reynkens, Tom [1]
Schmitt, Eric [1]
Verdonck, Tim [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Math, Leuven, Belgium
Keywords
Dimension reduction; Outlier detection; Robustness; Projection-pursuit approach; Principal components; Robust PCA
DOI
10.1080/00401706.2015.1093962
Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
A new sparse PCA algorithm is presented, which is robust against outliers. The approach is based on the ROBPCA algorithm that generates robust but nonsparse loadings. The construction of the new ROSPCA method is detailed, as well as a selection criterion for the sparsity parameter. An extensive simulation study and a real data example are performed, showing that it is capable of accurately finding the sparse structure of datasets, even when challenging outliers are present. In comparison with a projection pursuit-based algorithm, ROSPCA demonstrates superior robustness properties and comparable sparsity estimation capability, as well as significantly faster computation time.
Pages: 424-434
Number of pages: 11