Varying density method for data stream clustering

Cited by: 0
Authors
Mousavi, Maryam [1, 2]
Khotanlou, Hassan [1]
Bakar, Azuraliza Abu [2]
Vakilian, Mohammadmahdi [3]
Affiliations
[1] Department of Computer Engineering, Faculty of Engineering, Bu-Ali Sina University, Hamedan, Iran
[2] Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
[3] Department of Electrical Engineering, Faculty of Engineering, Hamedan Branch, Islamic Azad University, Hamedan, Iran
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a new online-offline density-based clustering method for data streams with varying density. In the online phase, a summary of the data is built in the form of micro-clusters; in the offline phase, this synopsis is used to form the final clusters. The goal of the online phase is to find accurate micro-clusters. When a new data point arrives, finding the nearest, best-fitting micro-cluster is time-consuming and can increase the execution time. To address this problem, a new merging algorithm is proposed. To keep the number of micro-clusters limited, a pruning process is applied alongside the summarization process. In existing methods, this pruning process takes too long to remove micro-clusters that do not receive objects frequently, which increases memory usage. To solve this problem, a new pruning algorithm is introduced. Another problem with density-based methods is that they use global parameters on data sets with varying density, which can dramatically decrease clustering quality. To create the final clusters, a new density-based algorithm that relies only on the MinPts parameter is proposed to increase the clustering quality on data sets with varying density. A performance evaluation on both synthetic and real data sets illustrates the efficiency and effectiveness of the proposed method. The experimental results show that the method can increase clustering quality on data sets with varying density while keeping time and memory usage limited. © 2020 Elsevier B.V.
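The online phase described in the abstract (assign each arriving point to its nearest micro-cluster, and prune micro-clusters that rarely receive objects) can be illustrated with a generic sketch. This is not the paper's merging or pruning algorithm, which the abstract does not specify. The names `MicroCluster` and `OnlineSummarizer`, the fixed `radius` test, the exponential decay rate `lam`, and the `min_weight` threshold are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Hypothetical summary of nearby stream points (not the paper's structure)."""
    linear_sum: list          # decayed per-dimension sum of absorbed points
    weight: float = 1.0       # decayed number of absorbed points
    last_update: float = 0.0  # time of the last decay step

    def center(self):
        return [s / self.weight for s in self.linear_sum]

    def decay(self, now, lam):
        # Assumed fading model: weight halves every 1/lam time units.
        factor = 2.0 ** (-lam * (now - self.last_update))
        self.linear_sum = [s * factor for s in self.linear_sum]
        self.weight *= factor
        self.last_update = now

    def absorb(self, point):
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]
        self.weight += 1.0


class OnlineSummarizer:
    """Generic online phase: fade, prune, then absorb or create."""

    def __init__(self, radius, lam=0.01, min_weight=0.5):
        self.radius = radius          # assumed maximum absorb distance
        self.lam = lam                # assumed decay rate
        self.min_weight = min_weight  # assumed pruning threshold
        self.micro_clusters = []

    def insert(self, point, now):
        # Fade every summary to the current time (a linear scan, kept simple).
        for mc in self.micro_clusters:
            mc.decay(now, self.lam)
        # Prune summaries that have not received points recently.
        self.micro_clusters = [
            mc for mc in self.micro_clusters if mc.weight >= self.min_weight
        ]
        # Find the nearest remaining micro-cluster, if any.
        nearest = min(
            self.micro_clusters,
            key=lambda mc: math.dist(mc.center(), point),
            default=None,
        )
        if nearest is not None and math.dist(nearest.center(), point) <= self.radius:
            nearest.absorb(point)
        else:
            self.micro_clusters.append(MicroCluster(list(point), 1.0, now))
```

In an online-offline design, an offline step would then cluster the micro-cluster centers. The abstract states that the paper uses a density-based algorithm driven only by MinPts for that step; the sketch above does not attempt to reproduce it.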