A grid-based subspace clustering algorithm for high-dimensional data streams

Authors
Sun, Yufen [1]
Lu, Yansheng [1]
Affiliation
[1] Huazhong Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430074, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many applications require the clustering of high-dimensional data streams. We propose a subspace clustering algorithm that can find clusters in different subspaces in a single pass over a data stream. The algorithm combines the bottom-up and top-down grid-based methods. A uniformly partitioned grid data structure is used to summarize the data stream online. The top-down grid partition method is used to find the subspaces in which clusters are located. The errors made by the top-down partition procedure are eliminated by a merging step in our algorithm. Our performance study with real and synthetic datasets demonstrates the efficiency and effectiveness of the proposed algorithm.
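The online summarization idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only shows, under stated assumptions, how a uniformly partitioned grid can summarize a stream in one pass and how cell counts can be projected onto a candidate subspace. The class name `GridSummary`, the parameters `bins` and `min_fraction`, the density threshold rule, and the assumption that every coordinate is scaled to [0, 1] are all hypothetical choices made for illustration. The top-down partition and merging steps are not shown.

```python
from collections import Counter
from typing import Sequence, Set, Tuple

Cell = Tuple[int, ...]


class GridSummary:
    """Per-cell point counts on a uniform grid (assumes coordinates in [0, 1])."""

    def __init__(self, dims: int, bins: int) -> None:
        self.dims = dims      # number of dimensions of each stream point
        self.bins = bins      # number of equal-width intervals per dimension
        self.counts: Counter = Counter()
        self.total = 0        # number of points seen so far

    def cell_of(self, point: Sequence[float]) -> Cell:
        # Clamp so out-of-range values and x == 1.0 map to a valid bin.
        return tuple(
            min(max(int(x * self.bins), 0), self.bins - 1) for x in point
        )

    def add(self, point: Sequence[float]) -> None:
        # One-pass update: each point is seen once and then discarded.
        self.counts[self.cell_of(point)] += 1
        self.total += 1

    def project(self, subspace: Sequence[int]) -> Counter:
        # Sum the counts of all cells that agree on the chosen dimensions.
        projected: Counter = Counter()
        for cell, n in self.counts.items():
            projected[tuple(cell[d] for d in subspace)] += n
        return projected

    def dense_cells(self, subspace: Sequence[int], min_fraction: float) -> Set[Cell]:
        # Hypothetical density rule: a projected cell is dense when it
        # holds at least min_fraction of all points seen so far.
        threshold = min_fraction * self.total
        return {c for c, n in self.project(subspace).items() if n >= threshold}


# Tiny illustrative stream: three points agree on dimensions 0 and 1
# but are spread out along dimension 2.
grid = GridSummary(dims=3, bins=4)
for p in [(0.1, 0.1, 0.9), (0.15, 0.2, 0.3), (0.2, 0.05, 0.6), (0.9, 0.8, 0.1)]:
    grid.add(p)

dense_in_subspace = grid.dense_cells(subspace=(0, 1), min_fraction=0.5)
dense_in_full_space = grid.dense_cells(subspace=(0, 1, 2), min_fraction=0.5)
```

In this toy stream the three nearby points share one cell in the subspace spanned by dimensions 0 and 1, while no cell of the full three-dimensional grid reaches the threshold, which is the kind of situation that motivates searching for clusters in subspaces.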
Pages: 37-48
Page count: 12