Using principal component analysis and correspondence analysis for estimation in latent variable models

Cited by: 10
Authors
Lynn, HS [1]
McCulloch, CE
Affiliations
[1] Rho Inc, Chapel Hill, NC 27514 USA
[2] Cornell Univ, Biometr Unit, Dept Stat Sci, Ithaca, NY 14853 USA
Keywords
consistency; correspondence analysis; incidental parameters; principal component analysis;
DOI
10.2307/2669399
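Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes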
020208 ; 070103 ; 0714 ;
Abstract
Correspondence analysis (CA) and principal component analysis (PCA) are often used to describe multivariate data. In certain applications they have been used for estimation in latent variable models. The theoretical basis for such inference is assessed in generalized linear models where the linear predictor equals $\alpha_j + x_i \beta_j$ or $a_j - b_j (x_i - u_j)^2$ $(i = 1, \ldots, n;\ j = 1, \ldots, m)$, and $x_i$ is treated as a latent fixed effect. The PCA and CA eigenvectors/column scores are evaluated as estimators of $\beta_j$ and $u_j$. With $m$ fixed and $n \to \infty$, consistent estimators cannot be obtained due to the incidental parameters problem unless sufficient "moment" conditions are imposed on $x_i$. PCA is equivalent to maximum likelihood estimation for the linear Gaussian model and gives a consistent estimator of $\beta_j$ (up to a scale change) when the second sample moment of $x_i$ is positive and finite in the limit. It is inconsistent for Poisson and Bernoulli distributions, but when $b_j$ is constant, its first and/or second eigenvectors can consistently estimate $u_j$ (up to a location and scale change) for the quadratic Gaussian model. In contrast, the CA estimator is always inconsistent. For finite samples, however, the CA column scores often have high correlations with the $u_j$'s, especially when the response curves are spread out relative to one another. The correlations obtained from PCA are usually weaker, although the second PCA eigenvector can sometimes do much better than the first eigenvector, and for incidence data with tightly clustered response curves its performance is comparable to that of CA. For small sample sizes, PCA and particularly CA are competitive alternatives to maximum likelihood and may be preferred because of their computational ease.
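The abstract's claim that PCA recovers the slopes of the linear Gaussian model up to a scale change can be illustrated with a small simulation. The sketch below is not taken from the paper: the values of m, n, the noise level, the slopes, and the intercepts are all hypothetical, the noise variance is assumed equal across columns, and the latent x values are drawn once from a standard normal purely for convenience (the paper treats them as fixed effects). It checks that the first principal axis of the column-centered data is nearly parallel to the slope vector.

```python
import numpy as np

def first_pc(Y):
    """Return the first principal axis (unit vector) of the column-centered data."""
    Yc = Y - Y.mean(axis=0)                      # remove column means (absorbs the intercepts)
    _, _, vt = np.linalg.svd(Yc, full_matrices=False)
    return vt[0]                                 # leading right singular vector

def simulate_linear_gaussian(n, alpha, beta, sigma, rng):
    """Simulate y_ij = alpha_j + x_i * beta_j + e_ij with Gaussian noise e_ij."""
    x = rng.normal(size=n)                       # latent scores (illustrative assumption)
    noise = rng.normal(scale=sigma, size=(n, beta.size))
    return alpha + np.outer(x, beta) + noise

rng = np.random.default_rng(0)
m = 6                                            # number of response variables (hypothetical)
beta = np.linspace(-1.0, 2.0, m)                 # hypothetical slopes
alpha = rng.normal(size=m)                       # hypothetical intercepts
Y = simulate_linear_gaussian(5000, alpha, beta, sigma=0.5, rng=rng)

v = first_pc(Y)
# Absolute cosine between the PCA axis and beta; a value near 1 means
# the axis matches beta up to scale (and sign).
cosine = abs(v @ beta) / np.linalg.norm(beta)
print(f"|cos(v, beta)| = {cosine:.4f}")
```

Only the direction of the slope vector is compared, since the abstract states consistency up to a scale change; the sign of a principal axis is also arbitrary, hence the absolute value.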
Pages: 561-572
Page count: 12