Scale-Invariant Sparse PCA on High-Dimensional Meta-Elliptical Data

Cited by: 31
Authors
Han, Fang [1]
Liu, Han [2]
Affiliations
[1] Johns Hopkins Univ, Dept Biostat, Baltimore, MD 21205 USA
[2] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA
Funding
National Science Foundation (USA)
Keywords
Elliptical distribution; High-dimensional statistics; Principal component analysis; Robust statistics; PRINCIPAL COMPONENT ANALYSIS; MULTIVARIATE LOCATION; OUTLIER DETECTION; POWER METHOD; COVARIANCE; ESTIMATORS; MATRIX; DISPERSION;
DOI
10.1080/01621459.2013.844699
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We propose a semiparametric method for conducting scale-invariant sparse principal component analysis (PCA) on high-dimensional non-Gaussian data. Compared with sparse PCA, our method has a weaker modeling assumption and is more robust to possible data contamination. Theoretically, the proposed method achieves a parametric rate of convergence in estimating the parameter of interest under a flexible semiparametric distribution family; computationally, the proposed method exploits a rank-based procedure and is as efficient as sparse PCA; empirically, our method outperforms most competing methods on both synthetic and real-world datasets.
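The abstract says only that the method "exploits a rank-based procedure". As a rough illustration of what such a pipeline could look like, the sketch below estimates a correlation matrix from Kendall's tau with a sine transform and then extracts a sparse leading direction by truncated power iteration. Both choices are assumptions made for this example and are not confirmed by the abstract; the function names (`kendall_tau`, `rank_correlation_matrix`, `truncated_power_iteration`) and the sparsity parameter `k` are hypothetical. It uses only the Python standard library.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (O(n^2); tied pairs add 0)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))


def rank_correlation_matrix(columns):
    """Assumed estimator: R[j][k] = sin(pi/2 * tau_jk), with a unit diagonal.

    `columns` is a list of d variables, each a list of n observations.
    Because tau depends only on ranks, rescaling a variable leaves R unchanged.
    """
    d = len(columns)
    R = [[1.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            tau = kendall_tau(columns[j], columns[k])
            R[j][k] = R[k][j] = math.sin(0.5 * math.pi * tau)
    return R


def truncated_power_iteration(R, k, iters=100):
    """Assumed sparse solver: power steps on R, keeping the k largest entries."""
    d = len(R)
    v = [1.0 / math.sqrt(d)] * d  # uniform start; a sketch, not a tuned choice
    for _ in range(iters):
        w = [sum(R[i][j] * v[j] for j in range(d)) for i in range(d)]
        keep = set(sorted(range(d), key=lambda i: abs(w[i]), reverse=True)[:k])
        w = [w[i] if i in keep else 0.0 for i in range(d)]
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]  # renormalize to unit length
    return v
```

Calling `truncated_power_iteration(rank_correlation_matrix(columns), k)` returns a unit vector with at most `k` nonzero entries. Since only pairwise ranks enter the estimate, multiplying any variable by a positive constant does not change the result, which is one way to read "scale-invariant" here.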
Pages: 275-287
Page count: 13