A grid-based subspace clustering algorithm for high-dimensional data streams

Authors
Sun, Yufen [1]
Lu, Yansheng [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430074, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Many applications require the clustering of high-dimensional data streams. We propose a subspace clustering algorithm that can find clusters in different subspaces in a single pass over a data stream. The algorithm combines bottom-up and top-down grid-based methods. A uniformly partitioned grid data structure is used to summarize the data stream online. The top-down grid partition method is used to find the subspaces in which clusters are located. Errors introduced by the top-down partition procedure are eliminated by a merging step in our algorithm. Our performance study on real datasets and a synthetic dataset demonstrates the efficiency and effectiveness of the proposed algorithm.
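The abstract does not give the details of the grid data structure, so the following is only a minimal sketch of what an online summary over a uniformly partitioned grid could look like. It assumes known per-dimension bounds and a fixed number of equal-width intervals per dimension. The class name `UniformGridSummary` and its methods are hypothetical and are not taken from the paper. The top-down partitioning and merging steps are not shown.

```python
from collections import Counter
from typing import Sequence, Tuple


class UniformGridSummary:
    """Illustrative one-pass summary: counts stream points per uniform grid cell.

    This is an assumed design, not the authors' data structure.
    """

    def __init__(self, lows: Sequence[float], highs: Sequence[float], bins_per_dim: int):
        if len(lows) != len(highs):
            raise ValueError("lows and highs must have the same length")
        if bins_per_dim < 1:
            raise ValueError("bins_per_dim must be at least 1")
        self.lows = list(lows)
        self.highs = list(highs)
        self.bins = bins_per_dim
        self.counts: Counter = Counter()  # cell index tuple -> number of points
        self.n_points = 0

    def cell_of(self, point: Sequence[float]) -> Tuple[int, ...]:
        """Map a point to the index of the equal-width cell that contains it."""
        cell = []
        for x, lo, hi in zip(point, self.lows, self.highs):
            idx = int((x - lo) / (hi - lo) * self.bins)
            # Clamp so that x == hi (or out-of-range values) falls in an edge cell.
            cell.append(min(max(idx, 0), self.bins - 1))
        return tuple(cell)

    def add(self, point: Sequence[float]) -> None:
        """Absorb one stream point; the point itself is not stored."""
        self.counts[self.cell_of(point)] += 1
        self.n_points += 1

    def project(self, dims: Sequence[int]) -> Counter:
        """Aggregate cell counts onto the subspace spanned by the given dimensions."""
        projected: Counter = Counter()
        for cell, count in self.counts.items():
            projected[tuple(cell[d] for d in dims)] += count
        return projected


# Usage: summarize a tiny 3-dimensional stream, then inspect a 2-dimensional subspace.
grid = UniformGridSummary(lows=[0.0, 0.0, 0.0], highs=[1.0, 1.0, 1.0], bins_per_dim=4)
for p in [(0.1, 0.1, 0.9), (0.12, 0.2, 0.1), (0.9, 0.9, 0.5), (1.0, 0.0, 0.0)]:
    grid.add(p)
subspace_counts = grid.project([0, 1])
```

In this sketch the first two points fall into different cells of the full space but into the same cell once the third dimension is projected away, which illustrates how a cluster can be visible in a subspace while being spread out in the full space.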
Pages: 37-48
Page count: 12