Clustering by principal component analysis with Gaussian kernel in high-dimension, low-sample-size settings

Cited by: 13
Authors
Nakayama, Yugo [1]
Yata, Kazuyoshi [2]
Aoshima, Makoto [2]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Kyoto, Japan
[2] Univ Tsukuba, Inst Math, Tsukuba, Ibaraki 3058571, Japan
Keywords
HDLSS; Non-linear PCA; PC score; Radial basis function kernel; Spherical data; Statistical significance; Geometric representation; Data classification; PCA
DOI
10.1016/j.jmva.2021.104779
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
In this paper, we consider clustering based on kernel principal component analysis (KPCA) for high-dimension, low-sample-size (HDLSS) data. We give theoretical reasons why the Gaussian kernel is effective for clustering high-dimensional data. In addition, we discuss how to choose the scale parameter so that KPCA with the Gaussian kernel performs well. Finally, we test the clustering performance on microarray data sets. © 2021 The Author(s). Published by Elsevier Inc.
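The general idea in the abstract, clustering HDLSS data by the first PC score of a Gaussian-kernel KPCA, can be illustrated with a minimal sketch. This is not the authors' procedure: the kernel parameterization exp(-||x - y||^2 / gamma), the median-of-squared-distances choice of the scale parameter `gamma`, the sign-based split, and the simulated two-class data are all assumptions made here for illustration only.

```python
import numpy as np


def gaussian_kpca_scores(X, gamma, n_components=1):
    """Return KPCA scores using the kernel exp(-||x - y||^2 / gamma).

    The parameterization of the scale parameter `gamma` is an assumption
    of this sketch, not taken from the paper.
    """
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at 0 for numerical safety.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    K = np.exp(-d2 / gamma)
    n = K.shape[0]
    # Double-center the kernel matrix (centering in feature space).
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    eigval, eigvec = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]
    eigval, eigvec = eigval[order], eigvec[:, order]
    # PC scores of the training points: eigenvector scaled by sqrt(eigenvalue).
    return eigvec * np.sqrt(np.maximum(eigval, 0.0))


# Simulated HDLSS data (hypothetical): d = 2000 variables, 10 samples per class,
# with the classes differing by a mean shift of 1 in every coordinate.
rng = np.random.default_rng(0)
d, n1, n2 = 2000, 10, 10
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n1, d)),
    rng.normal(1.0, 1.0, size=(n2, d)),
])
truth = np.array([0] * n1 + [1] * n2)

# Assumed scale choice: median of the off-diagonal squared distances.
sq = np.sum(X ** 2, axis=1)
d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
gamma = np.median(d2[np.triu_indices(n1 + n2, k=1)])

scores = gaussian_kpca_scores(X, gamma, n_components=1)
# Two-cluster assignment from the sign of the first PC score.
labels = (scores[:, 0] > 0).astype(int)
# Cluster labels are only defined up to a swap of 0 and 1.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
```

The sign of an eigenvector is arbitrary, so the cluster labels are only meaningful up to relabeling, which is why `agreement` takes the larger of the two matchings.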
Pages: 15