Clustering with Minimum Spanning Trees: How Good Can It Be?

Cited by: 0
Authors
Gagolewski, Marek [1, 2]
Cena, Anna [2]
Bartoszuk, Maciej [3]
Brzozowski, Lukasz [2]
Affiliations
[1] Polish Acad Sci, Syst Res Inst, Ul Newelska 6, PL-01447 Warsaw, Poland
[2] Warsaw Univ Technol, Fac Math & Informat Sci, Ul Koszykowa 75, PL-00662 Warsaw, Poland
[3] QED Software, Ul Miedziana 3A, PL-00814 Warsaw, Poland
Funding
Australian Research Council
Keywords
Hierarchical partitional clustering; Minimum spanning tree; MST; Cluster validity measure; Single linkage; Genie algorithm; Mutual information; GRAPH; QUANTIZATION; VALIDATION;
DOI
10.1007/s00357-024-09483-1
Chinese Library Classification number
O1 [Mathematics]
Subject classification code
0701; 070101
Abstract
Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they are meaningful in low-dimensional partitional data clustering tasks. By identifying the upper bounds for the agreement between the best (oracle) algorithm and the expert labels from a large battery of benchmark data, we discover that MST methods can be very competitive. Next, we review, study, extend, and generalise a few existing, state-of-the-art MST-based partitioning schemes. This leads to some new noteworthy approaches. Overall, the Genie and the information-theoretic methods often outperform the non-MST algorithms such as K-means, Gaussian mixtures, spectral clustering, Birch, density-based, and classical hierarchical agglomerative procedures. Nevertheless, we identify that there is still some room for improvement, and thus the development of novel algorithms is encouraged.
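As a rough illustration of what an MST-based partitioning scheme can look like, the sketch below implements plain single linkage, which the record lists among its keywords: build the Euclidean MST, drop the k - 1 heaviest edges, and read the clusters off the connected components. It is not the Genie algorithm or any of the methods proposed in the paper. The function name `mst_single_linkage` and the toy data are hypothetical, and the code assumes all points are distinct, because SciPy treats zero entries of a dense distance matrix as missing edges.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def mst_single_linkage(X, k):
    """Split X into k clusters by cutting the k - 1 heaviest MST edges.

    Hypothetical helper for illustration; assumes distinct points.
    """
    # Dense pairwise Euclidean distance matrix (O(n^2) memory).
    D = squareform(pdist(X))
    # MST of the complete graph; n - 1 edges when all points are distinct.
    mst = minimum_spanning_tree(D).tocoo()
    # Keep all but the k - 1 heaviest edges.
    order = np.argsort(mst.data)
    keep = order[: len(mst.data) - (k - 1)]
    forest = coo_matrix(
        (mst.data[keep], (mst.row[keep], mst.col[keep])), shape=D.shape
    )
    # Each connected component of the remaining forest is one cluster.
    _, labels = connected_components(forest, directed=False)
    return labels


# Toy data (made up): two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels = mst_single_linkage(X, 2)
```

Single linkage is sensitive to outliers and chaining, which is one reason the literature considers other ways of cutting or merging along the MST.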
Pages: 90-112
Page count: 23
Related papers
50 entries in total
  • [21] Parametric and kinetic minimum spanning trees
    Agarwal, PK
    Eppstein, D
    Guibas, LJ
    Henzinger, MR
    39TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 1998, : 596 - 605
  • [22] Finding minimum congestion spanning trees
    Werneck, RFF
    Setubal, JC
    da Conceiçao, AF
    ALGORITHM ENGINEERING, 1999, 1668 : 60 - 71
  • [23] Minimum spanning trees for community detection
    Wu, Jianshe
    Li, Xiaoxiao
    Jiao, Licheng
    Wang, Xiaohua
    Sun, Bo
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2013, 392 (09) : 2265 - 2277
  • [24] Planar bichromatic minimum spanning trees
    Borgelt, Magdalene G.
    van Kreveld, Marc
    Loffler, Maarten
    Luo, Jun
    Merrick, Damian
    Silveira, Rodrigo I.
    Vahedi, Mostafa
    JOURNAL OF DISCRETE ALGORITHMS, 2009, 7 (04) : 469 - 478
  • [25] Galactic Archaeology and Minimum Spanning Trees
    MacFarlane, Ben A.
    Gibson, Brad K.
    Flynn, Chris M. L.
    MULTI-OBJECT SPECTROSCOPY IN THE NEXT DECADE: BIG QUESTIONS, LARGE SURVEYS, AND WIDE FIELDS, 2016, 507 : 79 - 83
  • [26] Distributed verification of minimum spanning trees
    Korman, Amos
    Kutten, Shay
    DISTRIBUTED COMPUTING, 2007, 20 (04) : 253 - 266
  • [27] Minimum Spanning Trees in Temporal Graphs
    Huang, Silu
    Fu, Ada Wai-Chee
    Liu, Ruifeng
    SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2015, : 419 - 430
  • [28] Minimum spanning trees and types of dissimilarities
    Leclerc, B
    EUROPEAN JOURNAL OF COMBINATORICS, 1996, 17 (2-3) : 255 - 264
  • [29] Increasing the weight of minimum spanning trees
    Frederickson, GN
    Solis-Oba, R
    JOURNAL OF ALGORITHMS, 1999, 33 (02) : 244 - 266
  • [30] A graph-theoretical clustering method based on two rounds of minimum spanning trees
    Zhong, Caiming
    Miao, Duoqian
    Wang, Ruizhi
    PATTERN RECOGNITION, 2010, 43 (03) : 752 - 766