ASCRClu: an adaptive subspace combination and reduction algorithm for clustering of high-dimensional data

被引:5
|
作者
Fatehi, Kavan [1 ]
Rezvani, Mohsen [2 ]
Fateh, Mansoor [2 ]
机构
[1] Yazd Univ, Yazd, Iran
[2] Shahrood Univ Technol, Shahrood, Iran
关键词
High-dimensional data; Subspace clustering; Cluster similarity; DENSITY;
D O I
10.1007/s10044-020-00884-7
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The curse of dimensionality in high-dimensional data is one of the major challenges in data clustering. Recently, a considerable amount of literature has been published on subspace clustering to address this challenge. The main objective of the subspace clustering is to discover clusters embedded in any possible combination of the attributes. Previous studies have mostly been generating redundant subspace clusters, leading to clustering accuracy loss and also increasing the running time. In this paper, a bottom-up density-based approach is proposed for clustering of high-dimensional data. We employ the cluster structure as a similarity measure to generate the optimal subspaces which result in raising the accuracy of the subspace clustering. Using this idea, we propose an iterative algorithm to discover similar subspaces using the similarity in the features of subspaces. At each iteration of this algorithm, it first determines similar subspaces, then combines them to generate higher-dimensional subspaces, and finally re-clusters the subspaces. The algorithm repeats these steps and converges to the final clusters. Experiments on various synthetic and real datasets show that the results of the proposed approach are significantly better in both quality and runtime comparing to the state of the art on clustering high-dimensional data. The accuracy of the proposed method is around 34% higher than the CLIQUE algorithm and around 6% higher than DiSH.
引用
收藏
页码:1651 / 1663
页数:13
相关论文
共 50 条
  • [1] ASCRClu: an adaptive subspace combination and reduction algorithm for clustering of high-dimensional data
    Kavan Fatehi
    Mohsen Rezvani
    Mansoor Fateh
    Pattern Analysis and Applications, 2020, 23 : 1651 - 1663
  • [2] Evolutionary Subspace Clustering Algorithm for High-Dimensional Data
    Nourashrafeddin, S. N.
    Arnold, Dirk V.
    Milios, Evangelos
    PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTATION COMPANION (GECCO'12), 2012, : 1497 - 1498
  • [3] Adaptive multi-view subspace clustering for high-dimensional data
    Yan, Fei
    Wang, Xiao-dong
    Zeng, Zhi-qiang
    Hong, Chao-qun
    PATTERN RECOGNITION LETTERS, 2020, 130 : 299 - 305
  • [4] A novel algorithm for fast and scalable subspace clustering of high-dimensional data
    Kaur A.
    Datta A.
    Journal of Big Data, 2015, 2 (01)
  • [5] Subspace selection for clustering high-dimensional data
    Baumgartner, C
    Plant, C
    Kailing, K
    Kriegel, HP
    Kröger, P
    FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2004, : 11 - 18
  • [6] Fast Adaptive K-Means Subspace Clustering for High-Dimensional Data
    Wang, Xiao-Dong
    Chen, Rung-Ching
    Yan, Fei
    Zeng, Zhi-Qiang
    Hong, Chao-Qun
    IEEE ACCESS, 2019, 7 : 42639 - 42651
  • [7] A grid-based subspace clustering algorithm for high-dimensional data streams
    Sun, Yufen
    Lu, Yansheng
    WEB INFORMATION SYSTEMS - WISE 2006 WORKSHOPS, PROCEEDINGS, 2006, 4256 : 37 - 48
  • [8] Subspace clustering of high-dimensional data: a predictive approach
    Brian McWilliams
    Giovanni Montana
    Data Mining and Knowledge Discovery, 2014, 28 : 736 - 772
  • [9] Subspace Clustering of High-Dimensional Data: An Evolutionary Approach
    Vijendra, Singh
    Laxman, Sahoo
    APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING, 2013, 2013
  • [10] Density Conscious Subspace Clustering for High-Dimensional Data
    Chu, Yi-Hong
    Huang, Jen-Wei
    Chuang, Kun-Ta
    Yang, De-Nian
    Chen, Ming-Syan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2010, 22 (01) : 16 - 30