A Data Stream Clustering Algorithm Based on Density and Extended Grid

Cited by: 5
Authors
Hua, Zheng [1 ,2 ]
Du, Tao [1 ,2 ]
Qu, Shouning [1 ,2 ]
Mou, Guodong [1 ,2 ]
Affiliations
[1] Univ Jinan, Sch Informat Sci & Engn, 336 West Rd Nan Xinzhuang, Jinan 250022, Shandong, Peoples R China
[2] Univ Jinan, Shandong Prov Key Lab Network Based Intelligent C, Jinan 250022, Shandong, Peoples R China
Keywords
Density clustering; Grid clustering; Data stream; Spark parallel;
DOI
10.1007/978-3-319-63312-1_61
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Building on the traditional grid density clustering algorithm, this paper proposes a data stream clustering algorithm based on density and extended grid (DEGDS). The algorithm combines the advantages of grid clustering and density clustering; by removing the need to set clustering parameters manually, it can find clusters of arbitrary shape. It uses the local density of each sample point and its distance from the other sample points to determine the number of cluster centers in the grid, so that cluster centers are determined automatically, which avoids the influence of a poorly chosen initial centroid on the clustering results. The algorithm is parallelized by partitioning the data with the Spark parallel framework. For data points that fall outside the grid, the grid is extended so that the clustering within the grid expands to cover them, which helps ensure clustering accuracy. Density-connectivity estimation and grid boundaries are introduced to merge grids, which reduces memory consumption. An attenuation factor is used to update grid density incrementally, reflecting the evolution of the spatial data stream. Experimental results show that, compared with traditional clustering algorithms, DEGDS substantially improves accuracy and efficiency and can be applied effectively to large-scale data clustering.
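The abstract mentions an attenuation factor that incrementally updates grid density but does not state the update rule. The sketch below illustrates one common form from grid-based stream clustering, an exponential decay where a cell's density is multiplied by `decay ** (t - t_last)` before the new point adds 1. This rule, the class name `DecayingGrid`, and the parameters `cell_size` and `decay` are assumptions made for illustration, not the formulation used in the DEGDS paper.

```python
import math


class DecayingGrid:
    """Hypothetical helper: maps points to grid cells and keeps a
    time-decayed density per cell (assumed exponential decay rule)."""

    def __init__(self, cell_size: float, decay: float):
        if cell_size <= 0:
            raise ValueError("cell_size must be positive")
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must lie strictly between 0 and 1")
        self.cell_size = cell_size
        self.decay = decay
        # cell index -> (density at last update, time of last update)
        self.cells = {}

    def cell_of(self, point):
        """Return the integer grid index of a point (one entry per dimension)."""
        return tuple(math.floor(x / self.cell_size) for x in point)

    def insert(self, point, t):
        """Decay the cell's stored density up to time t, then add the new point."""
        cell = self.cell_of(point)
        density, t_last = self.cells.get(cell, (0.0, t))
        density = density * self.decay ** (t - t_last) + 1.0
        self.cells[cell] = (density, t)
        return cell

    def density_at(self, cell, t):
        """Read a cell's density as of time t without modifying it."""
        density, t_last = self.cells.get(cell, (0.0, t))
        return density * self.decay ** (t - t_last)


grid = DecayingGrid(cell_size=1.0, decay=0.5)
first_cell = grid.insert((0.2, 0.7), t=0)
grid.insert((0.9, 0.1), t=1)          # same cell, one time step later
other_cell = grid.insert((-0.5, 2.3), t=3)
print(first_cell, grid.density_at(first_cell, t=3))
```

Storing the last update time with each cell means a cell is only touched when a point arrives in it, so old cells fade without a full sweep over the grid at every time step.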
Pages: 689-699
Number of pages: 11