Stability approach to selecting the number of principal components

Cited by: 4
Authors
Song, Jiyeon [1]
Shin, Seung Jun [1]
Affiliations
[1] Korea Univ, Dept Stat, 45 Anam Ro, Seoul 02841, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Principal component analysis; Stability selection; Structural dimension; Subsampling; SLICED INVERSE REGRESSION; CHOICE;
DOI
10.1007/s00180-018-0826-7
Chinese Library Classification number
O21 [Probability and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract
Principal component analysis (PCA) is a canonical tool that reduces data dimensionality by finding linear transformations that project the data into a lower-dimensional subspace while preserving the variability of the data. Selecting the number of principal components (PCs) is essential but challenging for PCA, since it is an unsupervised learning problem with no clear target label at the sample level. In this article, we propose a new method to determine the optimal number of PCs based on the stability of the space spanned by the PCs. A series of analyses with both synthetic and real data demonstrates the superior performance of the proposed method.
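The abstract does not spell out the selection criterion, so the following is only a rough illustration of the general idea of subspace stability under subsampling, not the authors' algorithm. It assumes a similarity measure (the mean squared canonical correlation between two top-k principal subspaces) and a simple "pick the most stable k" rule; the function names `pc_basis`, `subspace_similarity`, and `stability_profile`, and the parameters `n_pairs` and `frac`, are hypothetical.

```python
import numpy as np


def pc_basis(X):
    """Return all PC directions of X as columns (p x r), ordered by variance."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the PC directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt.T


def subspace_similarity(A, B):
    """Mean squared canonical correlation between span(A) and span(B).

    A and B are p x k matrices with orthonormal columns; the value lies
    in [0, 1] and equals 1 when the two subspaces coincide.
    """
    k = A.shape[1]
    return np.linalg.norm(A.T @ B, "fro") ** 2 / k


def stability_profile(X, k_max, n_pairs=50, frac=0.5, seed=0):
    """Average top-k subspace similarity over pairs of random subsamples.

    Assumed procedure (not taken from the paper): draw two subsamples
    without replacement, fit PCA on each, and compare the spans of the
    first k PCs for k = 1, ..., k_max.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = int(frac * n)
    scores = np.zeros(k_max)
    for _ in range(n_pairs):
        V1 = pc_basis(X[rng.choice(n, size=m, replace=False)])
        V2 = pc_basis(X[rng.choice(n, size=m, replace=False)])
        for k in range(1, k_max + 1):
            scores[k - 1] += subspace_similarity(V1[:, :k], V2[:, :k])
    return scores / n_pairs
```

Under these assumptions, a candidate choice is `k_hat = int(np.argmax(stability_profile(X, k_max))) + 1`, with `k_max` kept below the number of variables because the full space is trivially stable. When the leading eigenvalues are well separated, small values of k can be just as stable as the true dimension, so a plain argmax is only a heuristic.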
Pages: 1923-1938
Number of pages: 16
Related papers
50 in total
  • [31] Selecting the number of factors in principal component analysis by permutation testing: Numerical and practical aspects
    Vitale, Raffaele
    Westerhuis, Johan A.
    Naes, Tormod
    Smilde, Age K.
    de Noord, Onno E.
    Ferrer, Alberto
    JOURNAL OF CHEMOMETRICS, 2017, 31 (12)
  • [32] A principal components approach to combining regression estimates
    Merz, CJ
    Pazzani, MJ
    MACHINE LEARNING, 1999, 36 (1-2) : 9 - 32
  • [33] Reformulation of Bayesian Geostatistical Approach on Principal Components
    Zhao, Yue
    Luo, Jian
    WATER RESOURCES RESEARCH, 2020, 56 (04)
  • [34] Using Model Selection Criteria to Choose the Number of Principal Components
    Stanley L. Sclove
    Journal of Statistical Theory and Applications, 2021, 20 : 450 - 461
  • [35] A Principal Components Approach to Combining Regression Estimates
    Christopher J. Merz
    Michael J. Pazzani
    Machine Learning, 1999, 36 : 9 - 32
  • [36] Simultaneous Estimation of the Number of Principal Components and Kernel Parameter in KPCA
    Fu, Yujia
    Tao, Hongfeng
    Yang, Huizhong
    2017 6TH INTERNATIONAL SYMPOSIUM ON ADVANCED CONTROL OF INDUSTRIAL PROCESSES (ADCONIP), 2017, : 149 - 154
  • [37] Using Model Selection Criteria to Choose the Number of Principal Components
    Sclove, Stanley L.
    JOURNAL OF STATISTICAL THEORY AND APPLICATIONS, 2021, 20 (03): : 450 - 461
  • [38] A penalized likelihood approach to rotation of principal components
    Park, T
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2005, 14 (04) : 867 - 888
  • [39] A SPREADSHEET APPROACH TO PRINCIPAL COMPONENTS-ANALYSIS
    KACIAK, E
    KOCZKODAJ, WW
    JOURNAL OF MICROCOMPUTER APPLICATIONS, 1989, 12 (03): : 281 - 291
  • [40] Selection of number of principal components for de-noising signals
    Koutsogiannis, GS
    Soraghan, JJ
    ELECTRONICS LETTERS, 2002, 38 (13) : 664 - 666