Parallel Clustering Based on Partitions of Local Minimal-Spanning-Trees

Authors
Tsui, Shiau-Rung [1]
Wang, Wei-Jen [1, 3]
Chen, Shi-Shan [1]
Chen, Lee Shu-Teng [2]
Wang, Chilung [2]
Affiliations
[1] Natl Cent Univ, Dept Comp Sci & Informat Engn, Jhongli, Taiwan
[2] Ind Technol Res Inst, Clouding Comp Ctr Mobile Applicat, Hsinchu, Taiwan
[3] Natl Cent Univ, Software Res Ctr, Taoyuan, Taiwan
Keywords
clustering; parallel computing; graph-based clustering
DOI
10.1109/PAAP.2012.25
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Many traditional clustering algorithms scale poorly to large data sets. One common strategy is to parallelize the algorithms and run them, together with the input data, on high-performance computers. However, many graph-based clustering algorithms are hard to parallelize because they must compute the similarity between all pairs of data nodes. In this paper, we propose a new parallel clustering algorithm, Para-CPLM (Parallel Clustering based on Partitions of Local Minimal-spanning-trees), which combines three strategies: graph-based clustering, granular computing, and partition-and-merge. Para-CPLM partitions the data domain into several regions for parallel execution and builds a local minimal spanning tree in each region. Once the local trees are built, Para-CPLM combines them and applies a method, namely the GBC method, to determine the best number of clusters. After this first phase of clustering, it repeatedly finds better inter-cluster pairs (edges) to reform the merged tree, so that the tree becomes closer to a global minimal spanning tree. It then applies the GBC method again to find the best number of clusters. In our experiments, Para-CPLM achieves significantly shorter execution time and better scalability than the sequential GBC method, and its clustering results are almost identical to those produced by the sequential GBC method.
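The partition-and-merge idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Para-CPLM implementation: the abstract does not specify the partitioning scheme, the merge rule, or the GBC method, so everything here is an assumption made for illustration. The sketch splits points into vertical strips, builds a local minimal spanning tree per strip with Prim's algorithm, links adjacent strips by their shortest connecting edge, and then cuts the k - 1 longest edges as a simple stand-in for GBC. The refinement phase that moves the merged tree toward a global minimal spanning tree is omitted, and all function names are hypothetical.

```python
import math


def prim_mst(points, idx):
    """Return MST edges (weight, i, j) over the points whose indices are in idx."""
    if len(idx) < 2:
        return []
    # best[j] = (distance to the tree, nearest tree vertex) for each outside vertex j
    best = {j: (math.dist(points[idx[0]], points[j]), idx[0]) for j in idx[1:]}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        w, i = best.pop(j)
        edges.append((w, i, j))
        for k in best:
            d = math.dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


def partition_by_x(points, n_regions):
    """Assumed partitioning: equal-width vertical strips; empty strips are dropped."""
    xs = [p[0] for p in points]
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_regions or 1.0
    regions = [[] for _ in range(n_regions)]
    for i, p in enumerate(points):
        r = min(int((p[0] - lo) / width), n_regions - 1)
        regions[r].append(i)
    return [r for r in regions if r]


def merged_tree(points, n_regions):
    """Build local MSTs per region, then join adjacent regions by their shortest edge."""
    regions = partition_by_x(points, n_regions)
    edges = []
    for region in regions:  # independent work; each call could run in its own process
        edges.extend(prim_mst(points, region))
    for a, b in zip(regions, regions[1:]):
        edges.append(min((math.dist(points[i], points[j]), i, j) for i in a for j in b))
    return edges


def cut_into_clusters(n_points, edges, k):
    """Stand-in for GBC (assumption): drop the k - 1 longest edges, label components."""
    kept = sorted(edges)[: len(edges) - (k - 1)]
    parent = list(range(n_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    return [find(i) for i in range(n_points)]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
    tree = merged_tree(pts, n_regions=2)
    print(cut_into_clusters(len(pts), tree, k=2))
```

The merged tree is a spanning tree but not necessarily a minimal one, which is the gap the abstract's second phase is described as narrowing.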
Pages: 111-118
Number of pages: 8
Related papers
50 in total
  • [21] STORAGE REDUCTION THROUGH MINIMAL SPANNING TREES AND SPANNING FORESTS
    KANG, ANC
    LEE, RCT
    CHANG, CL
    CHANG, SK
    IEEE TRANSACTIONS ON COMPUTERS, 1977, 26 (05) : 425 - 434
  • [22] MINIMAL SPANNING TREE CLUSTERING METHOD
    MAGNUSKI, HS
    COMMUNICATIONS OF THE ACM, 1975, 18 (02) : 119 - 119
  • [23] APPLICATION OF MINIMAL SPANNING-TREES IN GLIOMA GRADING - A CLIPPER PROGRAM FOR THE CALCULATION AND CONSTRUCTION OF MINIMAL SPANNING-TREES
    KOLLES, H
    LUDT, H
    VINCE, GH
    FEIDEN, W
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 1994, 42 (03) : 201 - 206
  • [24] Minimal graphs with a prescribed number of spanning trees
    Stong, Richard
    AUSTRALASIAN JOURNAL OF COMBINATORICS, 2022, 82 : 182 - 196
  • [25] Uniform and minimal essential spanning forests on trees
    Haggstrom, O
    RANDOM STRUCTURES & ALGORITHMS, 1998, 12 (01) : 27 - 50
  • [26] DETERMINATION OF MINIMAL PATHS AND SPANNING TREES IN GRAPHS
    JUNG, HA
    COMPUTING, 1974, 13 (3-4) : 249 - 252
  • [27] Minimal spanning trees with a constraint on the number of leaves
    Fernandes, LM
    Gouveia, L
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1998, 104 (01) : 250 - 261
  • [28] The use of minimal spanning trees in particle physics
    Rainbolt, J. Lovelace
    Schmitt, M.
    JOURNAL OF INSTRUMENTATION, 2017, 12
  • [29] Cluster stability using minimal spanning trees
    Barzily, Zeev
    Volkovich, Zeev
    Akteke-Oeztuerk, Basak
    Weber, Gerhard-Wilhelm
    20TH INTERNATIONAL CONFERENCE, EURO MINI CONFERENCE CONTINUOUS OPTIMIZATION AND KNOWLEDGE-BASED TECHNOLOGIES, EUROPT'2008, 2008, : 248 - +
  • [30] MINIMAL SPANNING TREES AND STEIN'S METHOD
    Chatterjee, Sourav
    Sen, Sanchayan
    ANNALS OF APPLIED PROBABILITY, 2017, 27 (03) : 1588 - 1645