Varying density method for data stream clustering

Authors
Mousavi, Maryam [1, 2]
Khotanlou, Hassan [1]
Bakar, Azuraliza Abu [2]
Vakilian, Mohammadmahdi [3]
Affiliations
[1] Department of Computer Engineering, Faculty of Engineering, Bu-Ali Sina University, Hamedan, Iran
[2] Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
[3] Department of Electrical Engineering, Faculty of Engineering, Hamedan Branch, Islamic Azad University, Hamedan, Iran
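Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract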
This paper proposes a new online-offline density-based clustering method for data streams with varying density. In the online phase, a summary of the data is created in the form of micro-clusters; in the offline phase, this synopsis is used to form the final clusters. The goal of the online phase is to find accurate micro-clusters. When a new data point arrives, finding the nearest, best-fitting micro-cluster is time-consuming and can increase the execution time. To address this problem, a new merging algorithm is proposed. To keep the number of micro-clusters limited, a pruning process is applied alongside the summarization process. In existing methods, this pruning process takes too long to remove micro-clusters that do not receive objects frequently, which increases memory usage. To solve this problem, a new pruning algorithm is introduced. Another problem with density-based methods is that they use global parameters on data sets with varying density, which can dramatically reduce clustering quality. To create the final clusters, a new density-based algorithm that relies only on the MinPts parameter is proposed, with the aim of increasing clustering quality on data sets with varying density. A performance evaluation on both synthetic and real data sets illustrates the efficiency and effectiveness of the proposed method. The experimental results show that the method can increase clustering quality on data sets with varying density while keeping time and memory usage limited. © 2020 Elsevier B.V.
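The online phase described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify the merging or pruning rules, so the sketch falls back on a generic fading micro-cluster summary in the spirit of earlier density-based stream methods. The names `MicroCluster` and `OnlineSummarizer`, and the parameters `radius`, `lam`, and `min_weight`, are assumptions chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Compact summary of nearby points (hypothetical structure, not from the paper)."""
    linear_sum: list          # per-dimension weighted sum of absorbed points
    weight: float = 1.0       # faded count of absorbed points
    last_update: float = 0.0  # time of the last decay or absorb

    def center(self):
        return [s / self.weight for s in self.linear_sum]

    def decay(self, now, lam):
        # Exponential fading: older contributions count for less.
        factor = 2.0 ** (-lam * (now - self.last_update))
        self.weight *= factor
        self.linear_sum = [s * factor for s in self.linear_sum]
        self.last_update = now

    def absorb(self, point, now, lam):
        self.decay(now, lam)
        self.weight += 1.0
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]


class OnlineSummarizer:
    """Generic online phase: nearest-micro-cluster insertion plus weight-based pruning."""

    def __init__(self, radius, lam=0.1, min_weight=0.5):
        self.radius = radius          # assumed maximum absorb distance
        self.lam = lam                # assumed fading rate
        self.min_weight = min_weight  # assumed pruning threshold
        self.clusters = []

    def insert(self, point, now):
        # Linear scan for the nearest micro-cluster; this search is the
        # step the abstract identifies as time-consuming.
        best, best_dist = None, math.inf
        for mc in self.clusters:
            d = math.dist(mc.center(), point)
            if d < best_dist:
                best, best_dist = mc, d
        if best is not None and best_dist <= self.radius:
            best.absorb(point, now, self.lam)
        else:
            self.clusters.append(MicroCluster(list(point), 1.0, now))

    def prune(self, now):
        # Drop micro-clusters whose faded weight fell below the threshold,
        # i.e. those that have not received points recently.
        for mc in self.clusters:
            mc.decay(now, self.lam)
        self.clusters = [mc for mc in self.clusters if mc.weight >= self.min_weight]
```

In this sketch, a micro-cluster that stops receiving points loses weight over time and is eventually removed by `prune`, which bounds memory use. The offline phase, which per the abstract builds final clusters using only MinPts, is not shown because the abstract gives no details of how it works.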
Related papers
50 in total
  • [1] Varying density method for data stream clustering
    Mousavi, Maryam
    Khotanlou, Hassan
    Abu Bakar, Azuraliza
    Vakilian, Mohammadmahdi
    [J]. APPLIED SOFT COMPUTING, 2020, 97
  • [3] An Adaptive Density Data Stream Clustering Algorithm
    Ding, Shifei
    Zhang, Jian
    Jia, Hongjie
    Qian, Jun
    [J]. COGNITIVE COMPUTATION, 2016, 8 (01) : 30 - 38
  • [5] IMPROVED DENSITY BASED ALGORITHM FOR DATA STREAM CLUSTERING
    Mousavi, Maryam
    Abu Bakar, Azuraliza
    [J]. JURNAL TEKNOLOGI, 2015, 77 (18) : 73 - 77
  • [6] Clustering Stream Data by Exploring the Evolution of Density Mountain
    Gong, Shufeng
    Zhang, Yanfeng
    Yu, Ge
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2017, 11 (04) : 393 - 405
  • [7] Stream Data Clustering Based on Grid Density and Attraction
    Tu, Li
    Chen, Yixin
    [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2009, 3 (03)
  • [8] THE CLUSTERING ALGORITHM OF EVOLUTIONAL DATA STREAM BASED ON DENSITY
    Meng, Yuyu
    Zheng, Liying
    [J]. 3RD INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND COMPUTER SCIENCE (ITCS 2011), PROCEEDINGS, 2011 : 473 - 477
  • [9] Clustering Algorithm Based on Grid and Density for Data Stream
    Wang, Lang
    Li, Haiqing
    [J]. MATERIALS SCIENCE, ENERGY TECHNOLOGY, AND POWER ENGINEERING I, 2017, 1839
  • [10] MuDi-Stream: A multi density clustering algorithm for evolving data stream
    Amini, Amineh
    Saboohi, Hadi
    Herawan, Tutut
    Teh Ying Wah
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2016, 59 : 370 - 385