Nonlinear skeletons of data sets and applications - Methods based on subspace clustering

Cited by: 0
Authors
Georgiev, Pando G. [1 ]
Affiliation
[1] Univ Cincinnati, Dept Comp Sci, Cincinnati, OH 45221 USA
Keywords
DOI
Not available
Chinese Library Classification number
O29 [Applied mathematics];
Discipline classification code
070104;
Abstract
In many practical problems the data (given by the columns of a data matrix) lie, up to some precision, on a union of m affine subspaces, called the affine skeleton of the data. When m = 1, the best possible skeleton is found easily by Principal Component Analysis (PCA). When m > 1, however, this is a global optimization problem, which appears to be hard for larger m. Its solution reveals an internal structure of the data set: under mild assumptions, the data matrix can be decomposed uniquely (up to some unimportant ambiguities) as the product of a mixing matrix and a source matrix, which, more importantly, is sparse. This is a kind of Blind Signal Separation problem based on identifiability conditions involving sparseness, which we describe precisely. We develop a subspace clustering algorithm that generalizes the k-plane clustering algorithm and is suitable for separating sparse mixtures with larger sparsity. We present a "kernelization" of this idea, working in Reproducing Kernel Hilbert Spaces (RKHS), which leads to the notion of nonlinear skeletons. Nonlinear skeleton identification in RKHS is a fusion of two ideas: clustering and Kernel PCA. We propose that binary classification tasks can be performed by finding nonlinear skeletons of the training points in both classes. We demonstrate our algorithms by examples.
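The alternation described in the abstract (assign each point to its nearest affine subspace, then refit each subspace by PCA) can be sketched as follows. This is a generic k-plane-style illustration written for this record, not the paper's algorithm: the function names `fit_affine_subspace`, `residuals` and `affine_skeleton`, the random initialization, and the re-seeding of small clusters are all assumptions. Like any such alternation, it may stop at a local optimum, which is consistent with the abstract's remark that the m > 1 case is a hard global optimization problem.

```python
import numpy as np

def fit_affine_subspace(points, dim):
    """PCA fit of one affine subspace; columns of `points` are data points.
    Returns (offset, basis) with an orthonormal basis of shape (n_features, dim)."""
    offset = points.mean(axis=1, keepdims=True)
    # Left singular vectors of the centered data give the principal directions.
    u, _, _ = np.linalg.svd(points - offset, full_matrices=False)
    return offset, u[:, :dim]

def residuals(points, offset, basis):
    """Squared distance of every column of `points` to the affine subspace."""
    centered = points - offset
    projected = basis @ (basis.T @ centered)
    return np.sum((centered - projected) ** 2, axis=0)

def affine_skeleton(points, m, dim, n_iter=50, seed=0):
    """Hypothetical k-plane-style alternation for m affine subspaces of dimension dim."""
    rng = np.random.default_rng(seed)
    n = points.shape[1]
    labels = rng.integers(m, size=n)  # random initial assignment (an assumption)
    models = []
    for _ in range(n_iter):
        models = []
        for j in range(m):
            members = points[:, labels == j]
            if members.shape[1] <= dim:
                # Too few points to define the subspace: re-seed from random points.
                members = points[:, rng.choice(n, size=dim + 1, replace=False)]
            models.append(fit_affine_subspace(members, dim))
        # Reassign every point to the subspace with the smallest residual.
        dists = np.stack([residuals(points, o, b) for o, b in models])
        new_labels = dists.argmin(axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, models
```

With `m=1` the loop reduces to a single PCA fit, matching the easy case mentioned in the abstract. For `m > 1`, running the sketch from several values of `seed` and keeping the result with the smallest total residual is a common way to reduce the effect of poor initializations.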
Pages: 95-108
Page count: 14