Adaptive local Principal Component Analysis improves the clustering of high-dimensional data

Cited by: 2
Authors
Migenda, Nico [1 ,3 ]
Moeller, Ralf [2 ]
Schenck, Wolfram [1 ]
Affiliations
[1] Bielefeld Univ Appl Sci & Arts, Ctr Appl Data Sci CfADS, Bielefeld, Germany
[2] Bielefeld Univ, Fac Technol, Comp Engn Grp, Bielefeld, Germany
[3] Schulstr 10, D-33330 Gütersloh, Germany
Keywords
High-dimensional clustering; Potential function; Adaptive learning rate; Ranking criteria; Neural network-based PCA; Mixture PCA; Local PCA; LEARNING ALGORITHM; DECOMPOSITION; CONVERGENCE;
DOI
10.1016/j.patcog.2023.110030
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In local Principal Component Analysis (PCA), a distribution is approximated by multiple units, each representing a local region by a hyper-ellipsoid obtained through PCA. We present an extension for local PCA which adaptively adjusts both the learning rate of each unit and the potential function which guides the competition between the local units. Our local PCA method is an online neural network method in which unit centers and shapes are modified after the presentation of each data point. For several benchmark distributions, we demonstrate that our method improves the overall quality of clustering, especially for high-dimensional distributions where many conventional methods do not perform satisfactorily. Our online method is also well suited for the processing of streaming data: the two adaptive mechanisms lead to a quick reorganization of the clustering when the underlying distribution changes.
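The flavor of the abstract's approach can be illustrated with a minimal sketch: several units compete for each streamed sample, and the winner updates its center and a principal direction online (here via Oja's rule with a simple decaying per-unit learning rate). This is an illustrative simplification, not the authors' actual algorithm; the unit count, the hard-competition rule, and the learning-rate schedule are all assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 local units in 2-D, each with a center and one
# unit-norm principal direction (the major axis of its local ellipsoid).
n_units, dim = 3, 2
centers = rng.normal(size=(n_units, dim))
W = rng.normal(size=(n_units, dim))
W /= np.linalg.norm(W, axis=1, keepdims=True)
counts = np.ones(n_units)  # per-unit sample counters for the adaptive rate

def update(x):
    # Hard competition: the unit with the nearest center wins the sample.
    j = np.argmin(np.linalg.norm(centers - x, axis=1))
    eta = 1.0 / counts[j]                 # simple decaying learning rate
    counts[j] += 1
    centers[j] += eta * (x - centers[j])  # move center toward the sample
    d = x - centers[j]
    y = W[j] @ d                          # projection onto current direction
    W[j] += eta * y * (d - y * W[j])      # Oja's rule: online PCA update
    W[j] /= np.linalg.norm(W[j])          # keep the direction unit-norm

# Stream samples from two clusters elongated along the x-axis.
for _ in range(2000):
    base = np.array([0.0, 0.0]) if rng.integers(2) == 0 else np.array([5.0, 5.0])
    update(base + np.array([rng.normal(0, 2.0), rng.normal(0, 0.2)]))

# Winning units' directions should align with the clusters' elongated axis.
print(np.round(W, 2))
```

The full method in the paper additionally adapts the potential function that governs the competition and tracks a complete local hyper-ellipsoid per unit, which is what enables the quick reorganization under distribution drift mentioned in the abstract.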
Pages: 16
Related papers
50 in total
  • [21] Fujimori, K.; Goto, Y.; Liu, Y.; Taniguchi, M. Sparse principal component analysis for high-dimensional stationary time series. Scandinavian Journal of Statistics, 2023, 50(4): 1953-1983.
  • [22] Xu, H.; Caramanis, C.; Mannor, S. High dimensional principal component analysis with contaminated data. ITW 2009: IEEE Information Theory Workshop on Networking and Information Theory, 2009: 246+.
  • [23] Ding, G.-C.; Smalla, K.; Heuer, H.; Kropf, S. A new proposal for a principal component-based test for high-dimensional data applied to the analysis of PhyloChip data. Biometrical Journal, 2012, 54(1): 94-107.
  • [24] Zhang, Y. Lagged principal trend analysis for longitudinal high-dimensional data. Stat, 2019, 8(1).
  • [25] Yan, F.; Wang, X.-D.; Zeng, Z.-Q.; Hong, C.-Q. Adaptive multi-view subspace clustering for high-dimensional data. Pattern Recognition Letters, 2020, 130: 299-305.
  • [26] Zhang, Y.; Ouyang, Z. Joint principal trend analysis for longitudinal high-dimensional data. Biometrics, 2018, 74(2): 430-438.
  • [27] Cevikalp, H. High-dimensional data clustering by using local affine/convex hulls. Pattern Recognition Letters, 2019, 128: 427-432.
  • [28] Geng, Y.-A.; Li, Q.; Liang, M.; Chi, C.-Y.; Tan, J.; Huang, H. Local-density subspace distributed clustering for high-dimensional data. IEEE Transactions on Parallel and Distributed Systems, 2020, 31(8): 1799-1814.
  • [29] Li, R.; Yang, X.; Qin, X.; Zhu, W. Local gap density for clustering high-dimensional data with varying densities. Knowledge-Based Systems, 2019, 184.
  • [30] McLachlan, G. J.; Ng, S.-K.; Wang, K. Clustering of high-dimensional and correlated data. Data Analysis and Classification, 2010: 3-11.