A Fast and More Accurate Seed-and-Extension Density-Based Clustering Algorithm

Cited by: 0
Authors
Tung, Ming-Hao [1]
Chen, Yi-Ping Phoebe [2]
Liu, Chen-Yu [3]
Liao, Chung-Shou [4]
Affiliations
[1] Micron Technol Inc, Res & Dev, Hsinchu, Taiwan
[2] La Trobe Univ, Dept Comp Sci & Informat Technol, Melbourne, Australia
[3] Natl Tsing Hua Univ, Dept Ind Engn & Engn Management, Hsinchu, Taiwan
[4] Natl Tsing Hua Univ, Ind Engn & Engn Management, Hsinchu, Taiwan
Keywords
Clustering algorithms; Heuristic algorithms; Partitioning algorithms; Forestry; Machine learning algorithms; Shape; Numerical models; Center selection; density peaks; seed-and-extension; spanning tree; clustering;
DOI
10.1109/TKDE.2022.3161117
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering algorithms have been widely studied in many scientific areas, such as data mining, knowledge discovery, bioinformatics, and machine learning. Density peaks (DP), a density-based clustering algorithm proposed by Rodriguez and Laio, outperforms almost all other approaches. Although the DP algorithm performs well in many cases, there is still room for improvement in the precision of its output clusters and in the quality of the selected centers. In this study, we propose a more accurate clustering algorithm, seed-and-extension-based density peaks (SDP). SDP selects centers that capture the features of their clusters while building a spanning forest and, at the same time, constructs the output clusters in a seed-and-extension manner. Experimental results demonstrate the effectiveness of SDP, especially when dealing with clusters with relatively high densities. Specifically, we show that SDP is more accurate than the DP algorithm and other state-of-the-art clustering approaches in the quality of both output clusters and cluster centers, while maintaining a running time similar to that of the DP algorithm, particularly for a variety of time-series data. Moreover, SDP outperforms DP in the dynamic model, in which data point insertion and deletion are allowed. From a practical perspective, the proposed SDP algorithm is helpful for many application problems.
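For orientation, the baseline DP algorithm that the abstract compares against can be sketched as follows. This is a minimal illustration of the commonly described Rodriguez and Laio procedure (cutoff-based local density, distance to the nearest denser point, centers chosen by the product of the two), not the SDP method itself, whose details are not given in this record. The function name `density_peaks` and the parameters `d_c` and `n_clusters` are assumptions made for this sketch.

```python
import numpy as np

def density_peaks(points, d_c, n_clusters):
    """Sketch of baseline density peaks (DP) clustering; NOT the SDP variant."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances (O(n^2) memory, fine for small inputs).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density: number of other points strictly closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from highest to lowest density (ties broken by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # Convention: the densest point gets its largest distance as delta.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]  # points visited earlier (denser, or tied)
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j  # link to the nearest denser point
    # Centers: points with the largest rho * delta.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # assumption: the densest point is always a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Every other point inherits the label of its nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

The `parent` links from each point to its nearest denser neighbor form a tree over the data, and cutting it at the chosen centers yields one tree per cluster. How SDP builds its spanning forest and performs the seed-and-extension step may differ from this baseline.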
Pages: 5458-5471
Page count: 14
Related papers
50 in total
  • [41] Density-based particle swarm optimization algorithm for data clustering
    Alswaitti, Mohammed
    Albughdadi, Mohanad
    Isa, Nor Ashidi Mat
    EXPERT SYSTEMS WITH APPLICATIONS, 2018, 91 : 170 - 186
  • [42] Density-based clustering localization algorithm for wireless sensor networks
    Wang, Yong
    Hu, Liang-Liang
    Yuan, Chao-Yan
    Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China, 2013, 42 (03): : 406 - 409
  • [43] Efficient incremental density-based algorithm for clustering large datasets
    Bakr, Ahmad M.
    Ghanem, Nagia M.
    Ismail, Mohamed A.
    ALEXANDRIA ENGINEERING JOURNAL, 2015, 54 (04) : 1147 - 1154
  • [44] A Grid and Density-based Clustering Algorithm for Processing Data Stream
    Jia, Chen
    Tan, ChengYu
    Yong, Ai
    SECOND INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING: WGEC 2008, PROCEEDINGS, 2008, : 517 - +
  • [45] A New Approach on Density-Based Algorithm for Clustering Dense Areas
    Perchinunno, Paola
    L'Abbate, Samuela
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS, ICCSA 2022 WORKSHOPS, PT I, 2022, 13377 : 530 - 542
  • [46] An efficient automated incremental density-based algorithm for clustering and classification
    Azhir, Elham
    Navimipour, Nima Jafari
    Hosseinzadeh, Mehdi
    Sharifi, Arash
    Darwesh, Aso
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 114 (114): : 665 - 678
  • [47] A Multi Density-based Clustering Algorithm for Data Stream with Noise
    Amini, Amineh
    Saboohi, Hadi
    Teh, Ying Wah
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2013, : 1105 - 1112
  • [48] HDACC: a heuristic density-based ant colony clustering algorithm
    Chen, YF
    Liu, YS
    Fattah, CA
    Yan, GW
    IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON INTELLIGENT AGENT TECHNOLOGY, PROCEEDINGS, 2004, : 397 - 400
  • [49] Boosting Density-based Clustering Algorithm by Mean Approximation on Grids
    Zhang, Zhibing
    2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING ( GRC 2009), 2009, : 785 - 790
  • [50] Improvement of thinking theme discovery algorithm on density-based clustering
    University of Science and Technology Beijing, Beijing, China
    Unknown
    Int. J. Database Theory Appl., 1 (87-94):