A Clustering Algorithm Based on Density-Grid for Stream Data

Cited by: 5
Authors
Zhang, Dandan [1 ]
Tian, Hui [1 ]
Sang, Yingpeng [1 ]
Li, Yidong [1 ]
Wu, Yanbo [1 ]
Wu, Jun [1 ]
Shen, Hong [1 ]
Affiliation
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China
Keywords
Clustering; stream data; density-grid; Index Tree;
DOI
10.1109/PDCAT.2012.13
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Many real applications, such as network traffic monitoring, intrusion detection, satellite remote sensing, and electronic business, generate data in the form of a stream arriving continuously at high speed. Clustering is an important data analysis tool for knowledge discovery. Compared with traditional clustering, clustering stream data is an important and challenging problem that has attracted many researchers. It faces two main challenges. First, because the data arrive continuously at a high rate and computer storage capacity is limited, raw data can be scanned only once. Second, stream data change over time, so treating a data stream as a static data set can degrade clustering quality. In fact, users are more concerned with the evolving behavior of clusters, which can help them make correct decisions. This paper proposes a density-grid based clustering algorithm, PKS-Stream-I, for stream data. It optimizes PKS-Stream in density detection period selection and in sporadic grid detection and removal. Empirical results show that the proposed method yields better performance.
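As a rough illustration of the general density-grid idea mentioned in the abstract, the sketch below maps each arriving point to a grid cell in a single pass, lets cell densities decay over time, and periodically drops cells whose density has become negligible. It is not the PKS-Stream-I algorithm. The class name `DensityGrid`, the exponential decay model, and the parameters `cell_size`, `decay`, and `sparse_threshold` are all assumptions made for this example.

```python
import math


class DensityGrid:
    """Generic one-pass density grid with exponential decay (illustrative only)."""

    def __init__(self, cell_size=1.0, decay=0.9, sparse_threshold=0.1):
        self.cell_size = cell_size                # side length of a grid cell (assumed)
        self.decay = decay                        # per-time-unit decay factor in (0, 1)
        self.sparse_threshold = sparse_threshold  # cells below this density are dropped
        self.cells = {}                           # cell key -> (density, last update time)

    def _key(self, point):
        # Map a point to the integer coordinates of the cell containing it.
        return tuple(math.floor(x / self.cell_size) for x in point)

    def density(self, key, now):
        # Density of a cell at time `now`, decayed since its last update.
        d, t = self.cells.get(key, (0.0, now))
        return d * self.decay ** (now - t)

    def insert(self, point, now):
        # Each point is seen once: only its cell's summary is updated.
        key = self._key(point)
        self.cells[key] = (self.density(key, now) + 1.0, now)

    def prune(self, now):
        # Remove cells whose decayed density fell below the threshold.
        stale = [k for k in self.cells if self.density(k, now) < self.sparse_threshold]
        for k in stale:
            del self.cells[k]
        return len(stale)
```

Because only per-cell summaries are stored, raw points never need to be revisited, and decay lets older data count for less as the stream evolves. How often `prune` is called is left to the caller here.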
Pages: 398-403
Number of pages: 6
Related papers
50 in total
  • [1] A Density-Grid Based Clustering Algorithm on Data Stream Using Resilient Distributed Datasets
    Zhang, Yuan
    Zhang, Jiongmin
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, AI 2016, 2016, 9673 : 316 - 322
  • [2] Clustering Algorithm Based on Grid and Density for Data Stream
    Wang, Lang
    Li, Haiqing
    [J]. MATERIALS SCIENCE, ENERGY TECHNOLOGY, AND POWER ENGINEERING I, 2017, 1839
  • [3] A Distributed Density-Grid Clustering Algorithm for Multi-Dimensional Data
    Brown, Daniel
    Shi, Yong
    [J]. 2020 10TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2020, : 1 - 7
  • [4] A Density Granularity Grid Clustering Algorithm Based on Data Stream
    Wang, Li-fang
    Han, Xie
    [J]. EMERGING RESEARCH IN WEB INFORMATION SYSTEMS AND MINING, 2011, 238 : 113 - 120
  • [5] A Data Stream Clustering Algorithm Based on Density and Extended Grid
    Hua, Zheng
    Du, Tao
    Qu, Shouning
    Mou, Guodong
    [J]. INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2017, PT II, 2017, 10362 : 689 - 699
  • [6] A Fast Density-Grid Based Clustering Method
    Brown, Daniel
    Japa, Arialdis
    Shi, Yong
    [J]. 2019 IEEE 9TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2019, : 48 - 54
  • [7] A Grid and Density-based Clustering Algorithm for Processing Data Stream
    Jia, Chen
    Tan, ChengYu
    Yong, Ai
    [J]. SECOND INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING: WGEC 2008, PROCEEDINGS, 2008, : 517 - +
  • [8] A Kind of Data Stream Clustering Algorithm Based on Grid-Density
    Zhong Zhishui
    [J]. ADVANCES IN COMPUTER SCIENCE, ENVIRONMENT, ECOINFORMATICS, AND EDUCATION, PT II, 2011, 215 : 418 - 423
  • [9] Research on Parallel Data Stream Clustering Algorithm based on Grid and Density
    Hu, Weihua
    Cheng, Mingzhong
    Wu, Guoping
    Wu, Liang
    [J]. 2015 International Conference on Computer Science and Mechanical Automation (CSMA), 2015, : 70 - 75
  • [10] Outlier mining algorithm based on data-partitioning and density-grid
    Xing, Chang Zheng
    Tang, Cheng Long
    Wei, Ke
    [J]. 2012 INTERNATIONAL CONFERENCE ON CONTROL ENGINEERING AND COMMUNICATION TECHNOLOGY (ICCECT 2012), 2012, : 880 - 884