Clustering with Minimum Spanning Trees: How Good Can It Be?

Cited by: 0
Authors
Gagolewski, Marek [1, 2]
Cena, Anna [2 ]
Bartoszuk, Maciej [3 ]
Brzozowski, Lukasz [2 ]
Affiliations
[1] Polish Acad Sci, Syst Res Inst, Ul Newelska 6, PL-01447 Warsaw, Poland
[2] Warsaw Univ Technol, Fac Math & Informat Sci, Ul Koszykowa 75, PL-00662 Warsaw, Poland
[3] QED Software, Ul Miedziana 3A, PL-00814 Warsaw, Poland
Funding
Australian Research Council;
Keywords
Hierarchical partitional clustering; Minimum spanning tree; MST; Cluster validity measure; Single linkage; Genie algorithm; Mutual information; GRAPH; QUANTIZATION; VALIDATION;
DOI
10.1007/s00357-024-09483-1
Chinese Library Classification (CLC) number
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they are meaningful in low-dimensional partitional data clustering tasks. By identifying the upper bounds for the agreement between the best (oracle) algorithm and the expert labels from a large battery of benchmark data, we discover that MST methods can be very competitive. Next, we review, study, extend, and generalise a few existing, state-of-the-art MST-based partitioning schemes. This leads to some new noteworthy approaches. Overall, the Genie and the information-theoretic methods often outperform the non-MST algorithms such as K-means, Gaussian mixtures, spectral clustering, Birch, density-based, and classical hierarchical agglomerative procedures. Nevertheless, we identify that there is still some room for improvement, and thus the development of novel algorithms is encouraged.
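For orientation, the simplest MST-based partitioning scheme can be sketched as follows: build the Euclidean MST and delete its longest edges, which yields the same partition as single linkage. This is an illustrative sketch only, not the Genie algorithm or any of the methods proposed in the paper. The function name `mst_cut_clustering` and the toy data are assumptions made for this example, and the dense distance matrix limits it to small datasets with distinct points.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def mst_cut_clustering(points, n_clusters):
    """Split points into n_clusters by removing the longest MST edges.

    Hypothetical helper for illustration. Assumes all points are distinct,
    because SciPy treats zero entries of the matrix as missing edges.
    """
    dist = squareform(pdist(points))            # dense pairwise Euclidean distances
    mst = minimum_spanning_tree(dist).tocoo()   # n - 1 edges when points are distinct
    n_keep = len(mst.data) - (n_clusters - 1)   # drop the n_clusters - 1 longest edges
    keep = np.argsort(mst.data)[:n_keep]
    pruned = coo_matrix(
        (mst.data[keep], (mst.row[keep], mst.col[keep])), shape=mst.shape
    )
    # Each connected component of the pruned tree is one cluster.
    _, labels = connected_components(pruned, directed=False)
    return labels


# Toy data: two well-separated groups of three points each.
points = np.array(
    [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float
)
labels = mst_cut_clustering(points, n_clusters=2)
```

Cutting only the longest edges is sensitive to outliers and noise points, which is one reason more elaborate MST-based schemes such as those surveyed in the paper exist.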
Pages: 90-112
Page count: 23