Efficient Density-peaks Clustering Algorithms on Static and Dynamic Data in Euclidean Space

Cited by: 2
Authors
Amagata, Daichi [1]
Hara, Takahiro [1]
Affiliations
[1] Osaka Univ, Suita, Osaka, Japan
Keywords
Density-peaks clustering; parallel algorithms; multi-dimensional points; SEARCH;
DOI
10.1145/3607873
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Clustering multi-dimensional points is a fundamental task in many fields, and density-based clustering supports many applications because it can discover clusters of arbitrary shapes. This article addresses the problem of Density-Peaks Clustering (DPC) in Euclidean space. DPC already has many applications, but its straightforward implementation incurs O(n^2) time, where n is the number of points, and therefore does not scale to large datasets. To enable DPC on large datasets, we first propose an empirically efficient exact DPC algorithm, Ex-DPC. Although this algorithm is much faster than the straightforward implementation, it still takes O(n^2) time in theory. We hence propose a new exact algorithm, Ex-DPC++, that runs in o(n^2) time. We further accelerate both algorithms by leveraging multi-threading. Moreover, real-world datasets may receive arbitrary updates (point insertions and deletions), so it is important to support efficient cluster updates. To this end, we propose D-DPC for fully dynamic DPC. We conduct extensive experiments using real datasets, and our experimental results demonstrate that our algorithms are efficient and scalable.
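To make the O(n^2) cost of the straightforward implementation concrete, the following is a minimal Python sketch of baseline DPC as it is commonly defined; it is not the authors' Ex-DPC, Ex-DPC++, or D-DPC. The abstract does not spell out the definitions, so the sketch assumes the usual ones: a point's local density is the number of other points within a cutoff distance, and its dependent distance is the distance to the nearest point of higher density. The names `naive_dpc`, `d_cut`, and `delta_min` are illustrative, ties in density are broken by input order, and noise filtering is omitted.

```python
import math


def naive_dpc(points, d_cut, delta_min):
    """Baseline density-peaks clustering (illustrative sketch, O(n^2) time).

    points    : list of equal-length coordinate tuples
    d_cut     : cutoff distance for the local density (assumed parameter)
    delta_min : minimum dependent distance for a cluster center (assumed)
    Returns (rho, delta, labels); labels[i] is the index of i's center.
    """
    n = len(points)
    # All pairwise Euclidean distances: this table is the quadratic part.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points strictly within d_cut.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_cut)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index
    # (an assumption) so that the dependency structure has no cycles.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [math.inf] * n   # distance to nearest higher-density point
    parent = [-1] * n        # that nearest higher-density point
    for pos, i in enumerate(order):
        for j in order[:pos]:          # every point ranked above i
            if dist[i][j] < delta[i]:
                delta[i] = dist[i][j]
                parent[i] = j

    # A point far from every denser point starts a cluster; all other
    # points inherit the label of their nearest denser neighbour.
    labels = [-1] * n
    for i in order:
        if delta[i] >= delta_min:
            labels[i] = i
        else:
            labels[i] = labels[parent[i]]
    return rho, delta, labels
```

Both the distance table and the nested loop over higher-density points touch every pair of points, which is the quadratic behaviour the abstract says does not scale to large datasets.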
Pages: 27