Clustering by principal component analysis with Gaussian kernel in high-dimension, low-sample-size settings

Cited by: 13
Authors
Nakayama, Yugo [1 ]
Yata, Kazuyoshi [2 ]
Aoshima, Makoto [2 ]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Kyoto, Japan
[2] Univ Tsukuba, Inst Math, Tsukuba, Ibaraki 3058571, Japan
Keywords
HDLSS; Non-linear PCA; PC score; Radial basis function kernel; Spherical data; STATISTICAL SIGNIFICANCE; GEOMETRIC REPRESENTATION; DATA CLASSIFICATION; PCA;
DOI
10.1016/j.jmva.2021.104779
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline Codes
020208; 070103; 0714;
Abstract
In this paper, we consider clustering based on kernel principal component analysis (KPCA) for high-dimension, low-sample-size (HDLSS) data. We give theoretical reasons why the Gaussian kernel is effective for clustering high-dimensional data. In addition, we discuss a choice of the scale parameter that yields high performance of KPCA with the Gaussian kernel. Finally, we test the performance of the clustering on microarray data sets. (C) 2021 The Author(s). Published by Elsevier Inc.
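The procedure the abstract describes can be illustrated with a generic sketch: build the Gaussian (RBF) kernel matrix, center it in feature space, extract kernel PC scores from its eigendecomposition, and cluster samples by the first score. This is not the authors' method from the paper; the toy HDLSS data, the median heuristic for the scale parameter, and the sign-based clustering rule are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kpca_scores(X, gamma, n_components=2):
    """Kernel PCA with the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2).

    Returns the first n_components kernel PC scores of the rows of X.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances and the kernel matrix
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * D2)
    # Double-center K (centering in feature space)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    # Eigendecompose; the i-th score vector is sqrt(lambda_i) * v_i
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

# Toy HDLSS setting: n = 20 samples in d = 1000 dimensions, two groups
# separated by a mean shift (illustrative data, not from the paper).
rng = np.random.default_rng(0)
n, d = 20, 1000
X = rng.standard_normal((n, d))
X[n // 2:] += 1.0  # shift the second group of samples

# Median heuristic for the scale parameter (a common default; the paper
# studies its own, theoretically motivated choice).
sq = np.sum(X**2, axis=1)
D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
gamma = 1.0 / np.median(D2[np.triu_indices(n, 1)])

scores = gaussian_kpca_scores(X, gamma)
# Cluster by the sign of the first kernel PC score
pred = (scores[:, 0] > 0).astype(int)
```

With a strong enough mean shift, the first kernel PC score separates the two groups and the sign rule recovers the grouping (up to label flip).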
Pages: 15
Related Papers
50 records in total
  • [41] High-dimension, low-sample size perspectives in constrained statistical inference: The SARSCoV RNA genome in illustration
    Sen, Pranab K.
    Tsai, Ming-Tien
    Jou, Yuh-Shan
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2007, 102 (478) : 686 - 694
  • [42] Discriminating Tensor Spectral Clustering for High-Dimension-Low-Sample-Size Data
    Hu, Yu
    Qi, Fei
    Cheung, Yiu-Ming
    Cai, Hongmin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [43] Tensor Robust Principal Component Analysis with Low-Rank Weight Constraints for Sample Clustering
    Zhao, Yu-Ying
    Wang, Mao-Li
    Wang, Juan
    Yuan, Sha-Sha
    Liu, Jin-Xing
    Kong, Xiang-Zhen
    2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2020, : 397 - 401
  • [44] PCA Consistency for Non-Gaussian Data in High Dimension, Low Sample Size Context
    Yata, Kazuyoshi
    Aoshima, Makoto
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2009, 38 (16-17) : 2634 - 2652
  • [45] Small sample size fault data recognition based on the principal component analysis and kernel local Fisher discriminant analysis
    Zhao, Rongzhen
    Wang, Xuedong
    Deng, Linfeng
Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2015, 43 (12): 12 - 16
  • [47] Dimension reduction in radio maps based on the supervised kernel principal component analysis
    Jia, Bing
    Huang, Baoqi
    Gao, Hepeng
    Li, Wuyungerile
    SOFT COMPUTING, 2018, 22 (23) : 7697 - 7703
  • [48] Improving the artificial neural network performance by principal components analysis for modeling high-dimension small-sample data sets
    Nga, Nguyen Phuong
    Thuy, Nguyen Thanh
    Binh, Nguyen Ngoc
    PROCEEDINGS OF THE ISSAT INTERNATIONAL CONFERENCE ON MODELING OF COMPLEX SYSTEMS AND ENVIRONMENTS, PROCEEDINGS, 2007, : 6 - +
  • [49] CLUSTERING HIGH DIMENSION, LOW SAMPLE SIZE DATA USING THE MAXIMAL DATA PILING DISTANCE
    Ahn, Jeongyoun
    Lee, Myung Hee
    Yoon, Young Joo
    STATISTICA SINICA, 2012, 22 (02) : 443 - 464
  • [50] Clustering of gamma-ray bursts through kernel principal component analysis
    Modak, Soumita
    Chattopadhyay, Asis Kumar
    Chattopadhyay, Tanuka
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2018, 47 (04) : 1088 - 1102