High dimension low sample size asymptotics of robust PCA

Cited by: 6
Authors
Zhou, Yi-Hui [1]
Marron, J. S. [2]
Affiliations
[1] N Carolina State Univ, Dept Biol Sci, Raleigh, NC 27695 USA
[2] Univ N Carolina, Dept Stat & Operat Res, Chapel Hill, NC 27515 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2015 / Vol. 9 / Issue 01
Keywords
Outlier; robustness; spherical PCA; spike model
DOI
10.1214/15-EJS992
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Conventional principal component analysis is highly susceptible to outliers. In particular, a single sufficiently outlying data point can draw the leading principal component toward itself. In this paper, we study the effects of outliers for high dimension and low sample size data, using asymptotics. The non-robust nature of conventional principal component analysis is verified through inconsistency under multivariate Gaussian assumptions with a single spike in the covariance structure, in the presence of a contaminating outlier. In the same setting, the robust method of spherical principal components is consistent with the population eigenvector for the spike model, even in the presence of contamination.
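The contrast described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes the data center is known to be the origin (so neither method centers the data), takes spherical PCA to mean ordinary PCA applied to the data after projecting each point onto the unit sphere, and uses illustrative values for the dimension `d`, sample size `n`, spike size, and outlier position that do not come from the paper. It uses only the Python standard library.

```python
import math
import random


def unit(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def leading_direction(rows, iters=200):
    """Leading eigenvector of sum_i x_i x_i^T (no centering), computed by
    power iteration on the small n x n Gram matrix."""
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in rows] for ri in rows]
    w = [1.0] * len(rows)
    for _ in range(iters):
        w = unit([sum(g * x for g, x in zip(g_row, w)) for g_row in gram])
    # Map the sample-space eigenvector back to feature space (v ~ X^T w).
    return unit([sum(wi * xi for wi, xi in zip(w, col)) for col in zip(*rows)])


random.seed(0)
d, n = 1000, 20          # illustrative: high dimension, low sample size
spike = d ** 1.5         # assumed spike eigenvalue, growing faster than d

# Clean Gaussian points: variance `spike` along the first axis, 1 elsewhere.
data = [
    [random.gauss(0, 1) * (math.sqrt(spike) if k == 0 else 1.0) for k in range(d)]
    for _ in range(n - 1)
]

# One contaminating outlier, far out along the second axis.
outlier = [0.0] * d
outlier[1] = 100 * math.sqrt(spike)
data.append(outlier)

v_pca = leading_direction(data)                      # conventional PCA
v_sph = leading_direction([unit(x) for x in data])   # spherical PCA

# Absolute cosine of the angle to the population spike direction e_1.
cos_pca = abs(v_pca[0])
cos_sph = abs(v_sph[0])
print(f"conventional PCA |cos| = {cos_pca:.3f}")
print(f"spherical PCA    |cos| = {cos_sph:.3f}")
```

In this setup the conventional leading component should align with the outlier rather than the spike direction, while the spherical version should stay close to the spike direction, because projecting onto the sphere caps the influence of any single point at that of a unit vector.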
Pages: 204-218
Page count: 15
Related papers
50 in total
  • [1] Boundary behavior in High Dimension, Low Sample Size asymptotics of PCA
    Jung, Sungkyu
    Sen, Arusharka
    Marron, J. S.
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2012, 109 : 190 - 203
  • [2] A survey of high dimension low sample size asymptotics
    Aoshima, Makoto
    Shen, Dan
    Shen, Haipeng
    Yata, Kazuyoshi
    Zhou, Yi-Hui
    Marron, J. S.
    [J]. AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, 2018, 60 (01) : 4 - 19
  • [3] THE STATISTICS AND MATHEMATICS OF HIGH DIMENSION LOW SAMPLE SIZE ASYMPTOTICS
    Shen, Dan
    Shen, Haipeng
    Zhu, Hongtu
    Marron, J. S.
    [J]. STATISTICA SINICA, 2016, 26 (04) : 1747 - 1770
  • [4] PCA CONSISTENCY IN HIGH DIMENSION, LOW SAMPLE SIZE CONTEXT
    Jung, Sungkyu
    Marron, J. S.
    [J]. ANNALS OF STATISTICS, 2009, 37 (6B) : 4104 - 4130
  • [5] Consistency of sparse PCA in High Dimension, Low Sample Size contexts
    Shen, Dan
    Shen, Haipeng
    Marron, J. S.
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2013, 115 : 317 - 333
  • [6] On Some Fast And Robust Classifiers For High Dimension, Low Sample Size Data
    Roy, Sarbojit
    Choudhury, Jyotishka Ray
    Dutta, Subhajit
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [7] PCA Consistency for Non-Gaussian Data in High Dimension, Low Sample Size Context
    Yata, Kazuyoshi
    Aoshima, Makoto
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2009, 38 (16-17) : 2634 - 2652
  • [8] Intrinsic Dimensionality Estimation of High-Dimension, Low Sample Size Data with D-Asymptotics
    Yata, Kazuyoshi
    Aoshima, Makoto
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2010, 39 (8-9) : 1511 - 1521
  • [9] On asymptotic normality of cross data matrix-based PCA in high dimension low sample size
    Wang, Shao-Hsuan
    Huang, Su-Yun
    Chen, Ting-Li
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2020, 175
  • [10] Robust centroid based classification with minimum error rates for high dimension, low sample size data
    Jiang, Jiancheng
    Marron, J. S.
    Jiang, Xuejun
    [J]. JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2009, 139 (08) : 2571 - 2580