Exploring dimension learning via a penalized probabilistic principal component analysis

Cited by: 3
Authors
Deng, Wei Q. [1, 2]
Craiu, Radu V. [3]
Affiliations
[1] McMaster Univ, Dept Psychiat & Behav Neurosci, Hamilton, ON, Canada
[2] St Josephs Healthcare Hamilton, Peter Boris Ctr Addict Res, Hamilton, ON, Canada
[3] Univ Toronto, Dept Stat Sci, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Dimension estimation; model selection; penalization; principal component analysis; probabilistic principal component analysis; profile likelihood; SELECTION; COVARIANCE; NUMBER; EIGENVALUES; SHRINKAGE; TESTS;
DOI
10.1080/00949655.2022.2100890
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Establishing a low-dimensional representation of the data leads to efficient data learning strategies. In many cases, the reduced dimension needs to be explicitly stated and estimated from the data. We explore the estimation of dimension in finite samples as a constrained optimization problem, where the estimated dimension is a maximizer of a penalized profile likelihood criterion within the framework of probabilistic principal component analysis. Unlike other penalized maximization problems that require an 'optimal' penalty tuning parameter, we propose a data-averaging procedure whereby the estimated dimension emerges as the most favourable choice over a range of plausible penalty parameters. The proposed heuristic is compared to a large number of alternative criteria in simulations and in an application to gene expression data. Extensive simulation studies reveal that no method uniformly dominates the others and highlight the importance of subject-specific knowledge in choosing statistical methods for dimension learning. Our application results also suggest that gene expression data have a higher intrinsic dimension than previously thought. Overall, our proposed heuristic strikes a good balance and is the method of choice when model assumptions are moderately violated.
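
The general idea in the abstract (maximize a penalized profile likelihood over the dimension, then keep the dimension chosen most often across a range of penalties) can be illustrated with a minimal sketch. This is not the authors' procedure: the profile log-likelihood is the standard probabilistic PCA form, while the penalty (a multiple `delta` of a free-parameter count), the penalty grid, the simulated data, and the function names `ppca_profile_loglik` and `vote_dimension` are all assumptions made for illustration.

```python
import numpy as np

def ppca_profile_loglik(eigvals, n):
    """Maximized PPCA log-likelihood for each candidate dimension k = 0..p-1."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    p = lam.size
    out = np.empty(p)
    for k in range(p):
        # MLE of the noise variance: mean of the p - k discarded eigenvalues.
        sigma2 = lam[k:].mean()
        out[k] = -0.5 * n * (
            p * np.log(2 * np.pi)
            + np.log(lam[:k]).sum()
            + (p - k) * np.log(sigma2)
            + p
        )
    return out

def vote_dimension(X, penalties):
    """Pick the dimension selected most often across a grid of penalties.

    ASSUMPTION: the penalty is delta * (number of free covariance parameters);
    the paper's actual penalty and averaging scheme may differ.
    """
    n, p = X.shape
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False, bias=True))
    ll = ppca_profile_loglik(lam, n)
    k = np.arange(p)
    n_params = p * k - k * (k - 1) / 2 + 1  # loadings up to rotation, plus sigma^2
    picks = [int(np.argmax(ll - delta * n_params)) for delta in penalties]
    counts = np.bincount(picks, minlength=p)
    return int(np.argmax(counts)), counts

# Hypothetical simulated data: 3 strong orthogonal factors in 10 dimensions.
rng = np.random.default_rng(0)
n, p, k_true = 500, 10, 3
Q, _ = np.linalg.qr(rng.standard_normal((p, k_true)))
W = Q * np.array([5.0, 4.0, 3.0])  # scale each orthonormal loading column
X = rng.standard_normal((n, k_true)) @ W.T + rng.standard_normal((n, p))

# Illustrative penalty grid spanning half to twice a BIC-like weight.
penalties = np.linspace(0.5, 2.0, 16) * 0.5 * np.log(n)
k_hat, counts = vote_dimension(X, penalties)
print(k_hat, counts)
```

Voting over the grid avoids committing to a single tuning value: a dimension that wins only for a narrow band of penalties is outvoted by one that is stable across the range.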
Pages: 266-297
Number of pages: 32