A Fast and More Accurate Seed-and-Extension Density-Based Clustering Algorithm

Cited by: 0
Authors
Tung, Ming-Hao [1]
Chen, Yi-Ping Phoebe [2]
Liu, Chen-Yu [3]
Liao, Chung-Shou [4]
Affiliations
[1] Micron Technol Inc, Res & Dev, Hsinchu, Taiwan
[2] La Trobe Univ, Dept Comp Sci & Informat Technol, Melbourne, Australia
[3] Natl Tsing Hua Univ, Dept Ind Engn & Engn Management, Hsinchu, Taiwan
[4] Natl Tsing Hua Univ, Ind Engn & Engn Management, Hsinchu, Taiwan
Keywords
Clustering algorithms; Heuristic algorithms; Partitioning algorithms; Forestry; Machine learning algorithms; Shape; Numerical models; Center selection; density peaks; seed-and-extension; spanning tree; clustering
DOI
10.1109/TKDE.2022.3161117
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering algorithms have been widely studied in many scientific areas, such as data mining, knowledge discovery, bioinformatics and machine learning. A density-based clustering algorithm called density peaks (DP), proposed by Rodriguez and Laio, outperforms almost all other approaches. Although the DP algorithm performs well in many cases, there is still room for improvement in the precision of its output clusters as well as in the quality of the selected centers. In this study, we propose a more accurate clustering algorithm, seed-and-extension-based density peaks (SDP). SDP selects centers that hold the features of their clusters while building a spanning forest and, at the same time, constructs the output clusters in a seed-and-extension manner. Experimental results demonstrate the effectiveness of SDP, especially when dealing with clusters with relatively high densities. Specifically, we show that SDP is more accurate than the DP algorithm as well as other state-of-the-art clustering approaches with respect to the quality of both output clusters and cluster centers, while maintaining a running time similar to that of the DP algorithm, particularly for a variety of time-series data. Moreover, SDP outperforms DP in the dynamic model, in which data point insertion and deletion are allowed. From a practical perspective, the proposed SDP algorithm is helpful for many application problems.
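For orientation, the baseline DP procedure that the abstract refers to can be sketched as follows. This is a minimal illustration of the generally described Rodriguez and Laio scheme (cutoff-based local density, distance to the nearest denser point, label propagation), not the SDP algorithm and not the authors' code. The function name `density_peaks`, the parameters `d_c` and `n_centers`, and the rule of taking the points with the largest density-times-distance product as centers are assumptions made for this sketch; the original DP method picks centers by inspecting a decision graph.

```python
import math


def density_peaks(points, d_c, n_centers):
    """Baseline density-peaks sketch (hypothetical helper, not SDP)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n    # distance to the nearest point ranked denser
    parent = [-1] * n    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # Convention for the densest point: its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Assumption of this sketch: centers are the points with the
    # largest rho * delta (a stand-in for reading the decision graph).
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_centers]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Each remaining point inherits the label of its nearest denser point,
    # which has already been labeled because of the visiting order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return centers, labels


# Two well-separated toy groups, each a unit square plus its midpoint.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
centers, labels = density_peaks(points, d_c=1.0, n_centers=2)
print(centers)  # [4, 9]
print(labels)   # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

In this toy data the two midpoints have the highest local density and are far from each other, so they are picked as centers. The sketch uses a full pairwise distance matrix, so it is quadratic in the number of points and intended only as an illustration.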
Pages: 5458-5471
Number of pages: 14
Related papers
50 items in total
  • [31] DENDIS: A new density-based sampling for clustering algorithm
    Ros, Frederic
    Guillaume, Serge
    EXPERT SYSTEMS WITH APPLICATIONS, 2016, 56 : 349 - 359
  • [32] A modified density-based clustering algorithm and its implementation
    Ban, Zhihua
    Liu, Jianguo
    Yuan, Lulu
    Yang, Hua
    MIPPR 2015: PATTERN RECOGNITION AND COMPUTER VISION, 2015, 9813
  • [33] Density-based clustering
    Campello, Ricardo J. G. B.
    Kroeger, Peer
    Sander, Jorg
    Zimek, Arthur
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2020, 10 (02)
  • [34] Density-based clustering
    Kriegel, Hans-Peter
    Kroeger, Peer
    Sander, Joerg
    Zimek, Arthur
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2011, 1 (03) : 231 - 240
  • [35] Combined Density-based and Constraint-based Algorithm for Clustering
    陈同孝
    陈荣昌
    林志强
    邱永兴
    Journal of DongHua University, 2006, (06) : 36 - 38
  • [36] A new density-based scheme for clustering based on genetic algorithm
    Lin, CY
    Chang, CC
    FUNDAMENTA INFORMATICAE, 2005, 68 (04) : 315 - 331
  • [37] Fast Parameterless Density-Based Clustering via Random Projections
    Schneider, Johannes
    Vlachos, Michail
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013 : 861 - 866
  • [38] Fast Density-Based Clustering Using Graphics Processing Units
    Loh, Woong-Kee
    Moon, Yang-Sae
    Park, Young-Ho
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (05) : 1349 - 1352
  • [39] A clustering algorithm based on density kernel extension
    Dai, Wei-Di
    He, Pi-Lian
    Hou, Yue-Xian
    Kang, Xiao-Dong
    ADVANCES IN MACHINE LEARNING AND CYBERNETICS, 2006, 3930 : 189 - 198
  • [40] HGADC: Hierarchical Genetic Algorithm with Density-Based Clustering for TSP
    Song, Zhenghan
    Li, Yunyi
    Wang, Wenjun
    BIO-INSPIRED COMPUTING: THEORIES AND APPLICATIONS, PT 1, BIC-TA 2023, 2024, 2061 : 262 - 275