Clustering with Minimum Spanning Trees: How Good Can It Be?

Authors
Gagolewski, Marek [1, 2]
Cena, Anna [2]
Bartoszuk, Maciej [3]
Brzozowski, Lukasz [2]
Affiliations
[1] Polish Acad Sci, Syst Res Inst, Ul Newelska 6, PL-01447 Warsaw, Poland
[2] Warsaw Univ Technol, Fac Math & Informat Sci, Ul Koszykowa 75, PL-00662 Warsaw, Poland
[3] QED Software, Ul Miedziana 3A, PL-00814 Warsaw, Poland
Funding
Australian Research Council;
Keywords
Hierarchical partitional clustering; Minimum spanning tree; MST; Cluster validity measure; Single linkage; Genie algorithm; Mutual information; GRAPH; QUANTIZATION; VALIDATION;
DOI
10.1007/s00357-024-09483-1
Chinese Library Classification (CLC) number
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they are meaningful in low-dimensional partitional data clustering tasks. By identifying the upper bounds for the agreement between the best (oracle) algorithm and the expert labels from a large battery of benchmark data, we discover that MST methods can be very competitive. Next, we review, study, extend, and generalise a few existing, state-of-the-art MST-based partitioning schemes. This leads to some new noteworthy approaches. Overall, the Genie and the information-theoretic methods often outperform the non-MST algorithms such as K-means, Gaussian mixtures, spectral clustering, Birch, density-based, and classical hierarchical agglomerative procedures. Nevertheless, we identify that there is still some room for improvement, and thus the development of novel algorithms is encouraged.
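For readers unfamiliar with how an MST yields a partition, the following is a minimal sketch of the classical single-linkage baseline named in the keywords: build the Euclidean MST, drop the k-1 heaviest edges, and label the connected components that remain. It is not the Genie algorithm or the information-theoretic methods evaluated in the paper; the function names `mst_edges` and `mst_cut_clustering` and the toy data are hypothetical, chosen only for illustration.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (weight, i, j) tuples, one per MST edge.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Relax distances of the remaining vertices through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def mst_cut_clustering(points, k):
    """Drop the k-1 heaviest MST edges and label the connected components."""
    n = len(points)
    edges = sorted(mst_edges(points))
    kept = edges[: max(len(edges) - (k - 1), 0)]

    # Union-find over the kept edges.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]

# Toy data (illustrative only): two well-separated groups of three points.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = mst_cut_clustering(pts, 2)
```

This quadratic-time version is meant only to make the idea concrete; it uses no external libraries and is not tuned for large datasets.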
Pages: 90-112
Number of pages: 23