Nonlinear skeletons of data sets and applications - Methods based on subspace clustering

Cited by: 0
Author
Georgiev, Pando G. [1]
Affiliation
[1] Univ Cincinnati, Dept Comp Sci, Cincinnati, OH 45221 USA
DOI
Not available
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
In many practical problems the data (given by the columns of a data matrix) lie, up to some precision, on a union of m affine subspaces, called the affine skeleton of the data. When m = 1, the best possible skeleton can be found easily by Principal Component Analysis (PCA). When m > 1, however, this is a global optimization problem, which appears to be hard for larger m. Its solution reveals an internal structure of the data set: under mild assumptions, the data matrix can be decomposed uniquely (up to some unimportant ambiguities) as the product of a mixing matrix and a source matrix, which, more importantly, is sparse. This is a kind of Blind Signal Separation problem based on identifiability conditions involving sparseness, which we describe precisely. We develop a subspace clustering algorithm, which is a generalization of the k-plane clustering algorithm and is suitable for separation of sparse mixtures with bigger sparsity. We present a "kernelization" of this idea, leading to the notion of nonlinear skeletons, where we work in Reproducing Kernel Hilbert Spaces (RKHS). Nonlinear skeleton identification in RKHS is a fusion of two ideas: clustering and Kernel PCA. We propose the idea that binary classification tasks can be performed by finding nonlinear skeletons of the training points in both classes. We demonstrate our algorithms by examples.
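The alternation behind an affine skeleton can be illustrated with a generic k-subspaces loop: fit each affine subspace by PCA on its current cluster, then reassign every point to the nearest subspace. The sketch below is a minimal illustration under stated assumptions, not the paper's algorithm; the function names (`fit_affine_subspace`, `residuals`, `affine_skeleton`), the random initialization, and the reseeding of small clusters are all hypothetical choices made for this example.

```python
import numpy as np


def fit_affine_subspace(points, dim):
    """PCA fit of a dim-dimensional affine subspace.

    points has shape (n_features, n_points); columns are data points.
    Returns the offset (column mean) and an orthonormal basis.
    """
    offset = points.mean(axis=1, keepdims=True)
    u, _, _ = np.linalg.svd(points - offset, full_matrices=False)
    return offset, u[:, :dim]


def residuals(points, offset, basis):
    """Squared distance of every column of points to the affine subspace."""
    centered = points - offset
    projected = basis @ (basis.T @ centered)
    return np.sum((centered - projected) ** 2, axis=0)


def affine_skeleton(points, m, dim, n_iter=50, seed=0):
    """Alternate PCA fits and nearest-subspace assignment (local search only).

    Assumption: all m subspaces share the same dimension dim <= n_features.
    If n_iter is exhausted before convergence, the returned models come
    from the last fit step, one assignment behind the returned labels.
    """
    rng = np.random.default_rng(seed)
    n_points = points.shape[1]
    labels = rng.integers(m, size=n_points)  # random initial clustering
    models = []
    for _ in range(n_iter):
        models = []
        for j in range(m):
            members = points[:, labels == j]
            if members.shape[1] <= dim:
                # Too few points to define the subspace: reseed at random.
                picks = rng.choice(n_points, size=dim + 1, replace=False)
                members = points[:, picks]
            models.append(fit_affine_subspace(members, dim))
        dists = np.stack([residuals(points, o, b) for o, b in models])
        new_labels = dists.argmin(axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, models
```

With m = 1 the loop reduces to a single PCA fit, matching the easy case described in the abstract. For m > 1 the result depends on the initialization and may be only a local minimum, which is consistent with the abstract's remark that the problem is a hard global optimization problem; restarting from several seeds and keeping the lowest total residual is one common workaround.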
Pages: 95-108
Page count: 14