INFERENCE FOR HETEROSKEDASTIC PCA WITH MISSING DATA

Cited by: 2
Authors
Yan, Yuling [1]
Chen, Yuxin [2]
Fan, Jianqing [3]
Affiliations
[1] MIT, Inst Data Syst & Soc, Cambridge, MA 02144 USA
[2] Univ Penn, Wharton Sch, Dept Stat & Data Sci, Philadelphia, PA USA
[3] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ USA
Source
ANNALS OF STATISTICS | 2024 / Vol. 52 / No. 02
Keywords
Principal component analysis; confidence regions; missing data; uncertainty quantification; heteroskedastic data; subspace estimation; LOW-RANK MATRIX; CONFIDENCE-INTERVALS; UNCERTAINTY QUANTIFICATION; PRINCIPAL COMPONENTS; SINGULAR SUBSPACES; LARGEST EIGENVALUE; ROBUST REGRESSION; GRADIENT DESCENT; COMPLETION; NOISY;
DOI
10.1214/24-AOS2366
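Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract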
This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimension, a problem that has been vastly underexplored. While computing measures of uncertainty for nonlinear/nonconvex estimators is in general difficult in high dimension, the challenge is further compounded by the prevalent presence of missing data and heteroskedastic noise. We propose a novel approach to performing valid inference on the principal subspace, on the basis of an estimator called HeteroPCA. We develop nonasymptotic distributional guarantees for HeteroPCA, and demonstrate how these can be invoked to compute both confidence regions for the principal subspace and entrywise confidence intervals for the spiked covariance matrix. Our inference procedures are fully data-driven and adaptive to heteroskedastic random noise, without requiring prior knowledge about the noise levels.
Pages: 729 - 756
Number of pages: 28