Efficient Density-peaks Clustering Algorithms on Static and Dynamic Data in Euclidean Space

Cited by: 2
Authors
Amagata, Daichi [1]
Hara, Takahiro [1]
Affiliations
[1] Osaka Univ, Suita, Osaka, Japan
Keywords
Density-peaks clustering; parallel algorithms; multi-dimensional points; SEARCH;
DOI
10.1145/3607873
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Clustering multi-dimensional points is a fundamental task in many fields, and density-based clustering supports many applications because it can discover clusters of arbitrary shapes. This article addresses the problem of Density-Peaks Clustering (DPC) in Euclidean space. DPC already has many applications, but its straightforward implementation incurs O(n^2) time, where n is the number of points, and therefore does not scale to large datasets. To enable DPC on large datasets, we first propose an empirically efficient exact DPC algorithm, Ex-DPC. Although this algorithm is much faster than the straightforward implementation, it still requires O(n^2) time in theory. We hence propose a new exact algorithm, Ex-DPC++, that runs in o(n^2) time. We further accelerate both algorithms by leveraging multi-threading. Moreover, real-world datasets may receive arbitrary updates (point insertions and deletions), so it is important to support efficient cluster updates. To this end, we propose D-DPC for fully dynamic DPC. We conduct extensive experiments using real datasets, and our experimental results demonstrate that our algorithms are efficient and scalable.
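For orientation, the straightforward O(n^2) implementation that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard density-peaks formulation, not the authors' Ex-DPC, Ex-DPC++, or D-DPC. The names `naive_dpc`, `assign_clusters`, `d_cut`, and `delta_min` are assumptions introduced here, as are the index-based tie-breaking rule and the choice of cluster centers by a threshold on the dependent distance.

```python
import math

def naive_dpc(points, d_cut):
    """Brute-force density-peaks quantities; two nested loops, so O(n^2) time.

    points: list of coordinate tuples; d_cut: cutoff distance (assumed name).
    """
    n = len(points)
    # Local density: number of other points strictly closer than d_cut.
    rho = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < d_cut:
                rho[i] += 1
                rho[j] += 1
    # Dependent point: nearest point that ranks higher in density.
    # Ties in density are broken by index (an assumption of this sketch).
    delta = [math.inf] * n
    parent = [-1] * n
    for i in range(n):
        for j in range(n):
            if (rho[j], -j) > (rho[i], -i):
                d = math.dist(points[i], points[j])
                if d < delta[i]:
                    delta[i], parent[i] = d, j
    return rho, delta, parent

def assign_clusters(rho, delta, parent, delta_min):
    """Label points in decreasing density order.

    A point whose dependent distance is at least delta_min starts a new
    cluster; every other point inherits the label of its dependent point.
    """
    order = sorted(range(len(rho)), key=lambda i: (-rho[i], i))
    labels = [-1] * len(rho)
    next_label = 0
    for i in order:
        if delta[i] >= delta_min:
            labels[i] = next_label
            next_label += 1
        else:
            labels[i] = labels[parent[i]]
    return labels
```

Calling `naive_dpc(points, d_cut)` and passing its result to `assign_clusters` yields one integer label per point. Both passes in `naive_dpc` compare every pair of points, which is the quadratic cost the proposed algorithms are designed to avoid.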
Pages: 27
Related papers
50 in total
  • [31] The Space Complexity of Pass-Efficient Algorithms for Clustering
    Chang, Kevin L.
    Kannan, Ravi
    PROCEEDINGS OF THE SEVENTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2006, : 1157 - 1166
  • [32] Fat node leading tree for data stream clustering with density peaks
    Xu, Ji
    Wang, Guoyin
    Li, Tianrui
    Deng, Weihui
    Gou, Guanglei
    KNOWLEDGE-BASED SYSTEMS, 2017, 120 : 99 - 117
  • [33] Density peaks clustering algorithm with nearest neighbor optimization for data with uneven density distribution
    Chen W.-C.
    Zhao J.
    Xiao R.-B.
    Wang H.
    Cui Z.-H.
    Kongzhi yu Juece/Control and Decision, 2024, 39 (03): : 919 - 928
  • [34] Clustering Mixed Data Based on Density Peaks and Stacked Denoising Autoencoders
    Duan, Baobin
    Han, Lixin
    Gou, Zhinan
    Yang, Yi
    Chen, Shuangshuang
    SYMMETRY-BASEL, 2019, 11 (02):
  • [35] CREDIBILISTIC FUZZY CLUSTERING BASED ON ANALYSIS OF DATA DISTRIBUTION DENSITY AND THEIR PEAKS
    Bodyanskiy, Ye, V
    Pliss, I. P.
    Shafronenko, A. Yu
    Kalynychenko, O., V
    RADIO ELECTRONICS COMPUTER SCIENCE CONTROL, 2022, (03) : 58 - 65
  • [36] RCDPeaks: memory-efficient density peaks clustering of long molecular dynamics
    Platero-Rochart, Daniel
    Gonzalez-Aleman, Roy
    Hernandez-Rodriguez, Erix W.
    Leclerc, Fabrice
    Caballero, Julio
    Montero-Cabrera, Luis
    BIOINFORMATICS, 2022, 38 (07) : 1863 - 1869
  • [37] Efficient parallel implementation of a density peaks clustering algorithm on graphics processing unit
    Ke-shi Ge
    Hua-you Su
    Dong-sheng Li
    Xi-cheng Lu
    Frontiers of Information Technology & Electronic Engineering, 2017, 18 : 915 - 927
  • [38] Efficient parallel implementation of a density peaks clustering algorithm on graphics processing unit
    Ge, Ke-shi
    Su, Hua-you
    Li, Dong-sheng
    Lu, Xi-cheng
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2017, 18 (07) : 915 - 927
  • [39] Dynamic and Static Enhanced BIRCH for Functional Data Clustering
    Li, Wang
    Li, Hanfang
    Luo, Youxi
    IEEE ACCESS, 2023, 11 : 111448 - 111465
  • [40] Geometric algorithms for density-based data clustering
    Chen, DZ
    Smid, M
    Xu, B
    INTERNATIONAL JOURNAL OF COMPUTATIONAL GEOMETRY & APPLICATIONS, 2005, 15 (03) : 239 - 260