Parallel Clustering Based on Partitions of Local Minimal-Spanning-Trees

Cited by: 0
Authors
Tsui, Shiau-Rung [1 ]
Wang, Wei-Jen [1 ,3 ]
Chen, Shi-Shan [1 ]
Chen, Lee Shu-Teng [2 ]
Wang, Chilung [2 ]
Affiliations
[1] Natl Cent Univ, Dept Comp Sci & Informat Engn, Jhongli, Taiwan
[2] Ind Technol Res Inst, Clouding Comp Ctr Mobile Applicat, Hsinchu, Taiwan
[3] Natl Cent Univ, Software Res Ctr, Taoyuan, Taiwan
Keywords
clustering; parallel computing; graph-based clustering
DOI
10.1109/PAAP.2012.25
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Many traditional clustering algorithms scale poorly on large data sets. One common remedy is to parallelize the algorithms and run them, together with the input data, on high-performance computers. However, many graph-based clustering algorithms are hard to parallelize because they need to calculate the similarity of all pairs of data nodes. In this paper, we propose a new parallel clustering algorithm called Para-CPLM (Parallel Clustering based on Partitions of Local Minimal-spanning-trees), which is based on three strategies: graph-based clustering, granular computing, and partition-and-merge. Para-CPLM partitions the data domain into several regions for parallel execution and establishes a local minimal spanning tree in each region. Once the local trees are established, Para-CPLM combines them and applies a method, namely the GBC method, to determine the best number of clusters. After this first phase of clustering, it repeatedly finds better inter-cluster pairs (edges) to reform the merged tree structure, so that the tree becomes closer to a global minimal spanning tree. It then applies the GBC method again to find the best number of clusters. In our experiments, Para-CPLM achieves significantly shorter execution time and better scalability than the sequential GBC method. In addition, its clustering results are almost identical to those produced by the sequential GBC method.
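The partition-and-merge idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: regions are vertical strips of roughly equal point count, adjacent regions are joined by their single shortest connecting edge, and the number of clusters `k` is supplied by the caller and realized by cutting the `k - 1` longest tree edges, which stands in for the GBC method (not described in the abstract). The iterative refinement of inter-cluster edges is omitted, and all function names are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partition(points, n_regions):
    """Split point indices into vertical strips (assumed partition scheme)."""
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    size = math.ceil(len(order) / n_regions)
    return [order[s:s + size] for s in range(0, len(order), size)]


def local_mst(points, idx):
    """Prim's algorithm restricted to one region; returns (weight, u, v) edges."""
    if len(idx) < 2:
        return []
    root = idx[0]
    # best[j] = (distance to the growing tree, tree node achieving it)
    best = {j: (math.dist(points[root], points[j]), root) for j in idx[1:]}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        w, parent = best.pop(j)
        edges.append((w, parent, j))
        for k in best:
            d = math.dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


def merge_regions(points, regions, local_edges):
    """Join adjacent strips by their shortest connecting edge (assumption)."""
    edges = [e for region_edges in local_edges for e in region_edges]
    for a, b in zip(regions, regions[1:]):
        edges.append(min((math.dist(points[u], points[v]), u, v)
                         for u in a for v in b))
    return edges


def cut_into_clusters(n, edges, k):
    """Drop the k - 1 longest tree edges and label the resulting components."""
    kept = sorted(edges)[:len(edges) - (k - 1)]
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, u, v in kept:
        parent[find(u)] = find(v)
    return [find(i) for i in range(n)]


def para_cluster(points, n_regions, k):
    regions = partition(points, n_regions)
    # Threads keep the sketch simple; CPU-bound work in CPython would
    # need a process pool to run truly in parallel.
    with ThreadPoolExecutor() as pool:
        local_edges = list(pool.map(lambda r: local_mst(points, r), regions))
    tree = merge_regions(points, regions, local_edges)
    return cut_into_clusters(len(points), tree, k)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(para_cluster(pts, n_regions=2, k=2))
```

Because each region only computes distances among its own points, the all-pairs cost is paid per region rather than over the whole data set; the merged tree is a spanning tree but not necessarily a minimal one, which is the gap the refinement phase described in the abstract is meant to narrow.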
Pages: 111-118
Number of pages: 8