Principal component analysis and clustering on manifolds

Cited by: 3
Authors
Mardia, Kanti V. [1, 2]
Wiechers, Henrik [3 ]
Eltzner, Benjamin [4 ]
Huckemann, Stephan F. [3 ]
Affiliations
[1] Univ Leeds, Sch Math, Dept Stat, Leeds LS2 9JT, W Yorkshire, England
[2] Univ Oxford, Dept Stat, Oxford OX1 3LB, England
[3] Georgia Augusta Univ, Felix Bernstein Inst Math Stat Biosci, D-37077 Gottingen, Germany
[4] Max Planck Inst Biophys Chem, D-37077 Gottingen, Germany
Keywords
Adaptive linkage clustering; Circular mode hunting; Dimension reduction; Multivariate wrapped normal; SARS-CoV-2 geometry; Stratified spheres; Torus PCA; CHALLENGES; STATISTICS; INFERENCE; SIZER
DOI
10.1016/j.jmva.2021.104862
Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Big data, high dimensional data, sparse data, large scale data, and imaging data are all becoming new frontiers of statistics. Changing technologies have created this flood and have led to a real hunger among scientists for new modeling strategies and data analysis. In many cases data are not Euclidean; for example, in molecular biology, the data sit on manifolds. Even on a simple non-Euclidean manifold (the circle), summarizing angles by the arithmetic average does not make sense, so more care is needed. Thus non-Euclidean settings throw up many major challenges, both mathematical and statistical. This paper focuses on PCA and clustering methods for some manifolds. Of course, PCA and clustering methods are among the core topics of multivariate analysis. We deal with two key manifolds from a practical point of view, namely spheres and tori. It is well known that dimension reduction on non-Euclidean manifolds with PCA-like methods has been a challenging task for quite some time, but recently there have been some breakthroughs. One of them is the idea of nested spheres, and another is transforming a torus into a sphere effectively and subsequently using the technology of nested spheres PCA. We also provide a new method of clustering for multivariate analysis which has a fundamental property required for molecular biology: it penalizes wrong assignments to avoid chemically no-go areas. We give various examples to illustrate these methods. One of the important examples deals with COVID-19 data.
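The abstract's remark about averaging angles can be illustrated with a small sketch. It is not taken from the paper; the function names `arithmetic_mean` and `circular_mean` are illustrative assumptions, and the circular mean shown is the standard direction of the summed unit vectors, which may or may not be the summary the authors use.

```python
import math

def arithmetic_mean(angles_deg):
    """Plain average of the angle values, ignoring that 0 and 360 coincide."""
    return sum(angles_deg) / len(angles_deg)

def circular_mean(angles_deg):
    """Direction (in degrees, in [0, 360)) of the sum of the unit vectors.

    Illustrative only: undefined when the vectors cancel exactly,
    e.g. for the angles 0 and 180.
    """
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

# Two directions close to 0 degrees, on either side of the wrap-around point.
angles = [350.0, 20.0]
naive = arithmetic_mean(angles)   # points almost opposite to both inputs
circ = circular_mean(angles)      # lies between the two inputs on the circle
print(naive, circ)
```

For the angles 350 and 20 degrees, the arithmetic average is 185 degrees, nearly opposite to both observations, whereas the circular mean is approximately 5 degrees.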
Pages: 21
Related papers
50 in total
  • [21] Dissimilarity Based Principal Component Analysis Using Fuzzy Clustering
    Sato-Ilic, Mika
    [J]. INTEGRATED UNCERTAINTY MANAGEMENT AND APPLICATIONS, 2010, 68 : 453 - 464
  • [22] Principal component clustering approach to teaching quality discriminant analysis
    Xian, Sidong
    Xia, Haibo
    Yin, Yubo
    Zhai, Zhansheng
    Shang, Yan
    [J]. COGENT EDUCATION, 2016, 3
  • [23] Gender classification based on fuzzy clustering and principal component analysis
    Hassanpour, Hamid
    Zehtabian, Amin
    Nazari, Avishan
    Dehghan, Hossein
    [J]. IET COMPUTER VISION, 2016, 10 (03) : 228 - 233
  • [24] Clustering and feature selection using sparse principal component analysis
    Luss, Ronny
    d'Aspremont, Alexandre
    [J]. OPTIMIZATION AND ENGINEERING, 2010, 11 (01) : 145 - 157
  • [25] Principal component analysis of galaxy clustering in hyperspace of galaxy properties
    Zhou, Shuren
    Zhang, Pengjie
    Chen, Ziyang
    [J]. MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY, 2023, 523 (04) : 5789 - 5798
  • [26] Facial Clustering Model upon Principal Component Analysis Databases
    Lee, Wookey
    Park, Simon Soon-Hyoung
    Afshar, Jafar
    Baek, Jongtae
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2017, : 1003 - 1007
  • [27] Principal component analysis for Riemannian manifolds, with an application to triangular shape spaces
    Huckemann, Stephan
    Ziezold, Herbert
    [J]. ADVANCES IN APPLIED PROBABILITY, 2006, 38 (02) : 299 - 319
  • [28] Functional Principal Component Analysis for Multiple Variables on Different Riemannian Manifolds
    Wang, Haixu
    Cao, Jiguo
    [J]. JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS, 2024,
  • [29] The Connections between Principal Component Analysis and Dimensionality Reduction Methods of Manifolds
    Li, Bo
    Liu, Jin
    [J]. ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2012, 6839 : 638 - +
  • [30] Use of principal component analysis and hierarchical clustering analysis to evaluate fingerprint residues
    Thomas, Robert
    Kuhns, Teresa
    Zentz, Stephanie
    Egolf, Debra
    [J]. ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2018, 255