A Data Stream Clustering Algorithm Based on Density and Extended Grid

Cited by: 5
Authors
Hua, Zheng [1 ,2 ]
Du, Tao [1 ,2 ]
Qu, Shouning [1 ,2 ]
Mou, Guodong [1 ,2 ]
Affiliations
[1] Univ Jinan, Sch Informat Sci & Engn, 336 West Rd Nan Xinzhuang, Jinan 250022, Shandong, Peoples R China
[2] Univ Jinan, Shandong Prov Key Lab Network Based Intelligent C, Jinan 250022, Shandong, Peoples R China
Keywords
Density clustering; Grid clustering; Data stream; Spark parallel;
DOI
10.1007/978-3-319-63312-1_61
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a data stream clustering algorithm based on density and extended grid (DEGDS), building on the traditional grid-density clustering algorithm. The algorithm combines the advantages of grid clustering and density clustering, removes the need to set clustering parameters manually, and can find clusters of arbitrary shape. It uses the local density of each sample point and its distance from the other sample points to determine the number of cluster centers in each grid, so the cluster centers are determined automatically and the results are not affected by a poor choice of initial centroids. During this process, the data are partitioned with the Spark parallel framework so that the clustering runs in parallel. For data points that fall outside a grid, the grid is extended so that the clustering within it expands to cover them, which preserves clustering accuracy. Density-connectivity estimation and grid boundaries are introduced to merge grids, which reduces memory consumption. An attenuation factor is used to update grid density incrementally, reflecting the evolution of the spatial data stream. Experimental results show that, compared with traditional clustering algorithms, DEGDS substantially improves accuracy and efficiency and can be applied effectively to clustering large data sets.
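The abstract names two mechanisms without giving formulas: incremental grid-density updates with an attenuation factor, and center selection from each point's local density and distance to other points. The sketch below illustrates both under stated assumptions and is not the DEGDS implementation. The exponential decay form, the cutoff-count density, the density-peaks-style distance score, and every name (`cell_of`, `DecayedGrid`, `density_peak_scores`, `cell_size`, `decay`, `cutoff`) are hypothetical choices made for illustration. Spark partitioning, grid extension, and grid merging are left out.

```python
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Map a point to the integer index of the grid cell containing it."""
    return tuple(math.floor(x / cell_size) for x in point)


class DecayedGrid:
    """Grid whose cell densities fade over time (assumed exponential decay)."""

    def __init__(self, cell_size, decay=0.9):
        self.cell_size = cell_size
        self.decay = decay                  # assumed attenuation factor in (0, 1)
        self.density = defaultdict(float)   # cell index -> density at last update
        self.last_update = {}               # cell index -> time of last update

    def insert(self, point, t):
        """Decay the cell's stored density to time t, then count the new point."""
        key = cell_of(point, self.cell_size)
        dt = t - self.last_update.get(key, t)
        self.density[key] = self.density[key] * self.decay ** dt + 1.0
        self.last_update[key] = t
        return key

    def density_at(self, key, t):
        """Density of a cell as seen at time t, without modifying the grid."""
        if key not in self.last_update:
            return 0.0
        return self.density[key] * self.decay ** (t - self.last_update[key])


def density_peak_scores(points, cutoff):
    """Return (rho, delta) per point.

    rho[i]   = number of other points closer than `cutoff` (local density).
    delta[i] = distance to the nearest point with higher rho; for a point
               with no denser neighbour, the largest distance to any point.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def pick_centers(points, cutoff, k):
    """Indices of the k points with the largest rho * delta score."""
    rho, delta = density_peak_scores(points, cutoff)
    order = sorted(range(len(points)), key=lambda i: rho[i] * delta[i],
                   reverse=True)
    return sorted(order[:k])
```

Points that are both dense and far from any denser point score highest, so they are natural center candidates. In this sketch the number of centers `k` is passed in explicitly, whereas the abstract states that DEGDS determines it automatically.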
Pages: 689-699
Number of pages: 11