Distributed structural clustering on large graph

Authors
Rong, Chuitian [1, 2]
Zhou, Jinyu [1, 3]
Affiliations
[1] Tiangong Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China
[2] Tianjin Key Lab Autonomous Intelligence Technol &, Tianjin, Peoples R China
[3] Tiangong Univ, Sch Comp Sci & Technol, Tianjin 300387, Peoples R China
Source
Keywords
distributed computing; MapReduce; structural clustering of graph; ALGORITHM;
DOI
10.1002/cpe.7756
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Graph clustering is a primitive operation in graph data mining. It plays an important role in revealing community clusters, hubs, and outliers in complex networks. In recent years, several graph clustering algorithms based on the well-studied SCAN algorithm have been proposed. However, SCAN and its improved sequential variants are prohibitively slow because of their iterative computations. The parallel variants focus on improving clustering efficiency by exploiting multi-core architectures on a single computing node with complex optimization techniques. As a result, SCAN and its variants are not suitable for processing very large graphs, given the limits on memory size and storage volume of a single node. In this article, we propose a distributed parallel structural clustering algorithm using MapReduce. To further improve efficiency, we propose optimization techniques, including partition-based clustering and simplified combination with labels, to accelerate the operations. We conducted extensive experiments on real-world datasets. The experimental results show that our algorithm is highly efficient and scales well under different settings.
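For orientation, SCAN-style structural clustering rests on a structural similarity between adjacent vertices: the size of their shared closed neighborhood divided by the geometric mean of the two neighborhood sizes. The sketch below computes that similarity and SCAN's core-vertex test in a map/reduce style in plain Python. It is not the algorithm of this article, whose partitioning and label-combination steps are not described in the abstract. The function names, the toy graph, and the map/reduce decomposition are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def map_edges(edges):
    """Map phase (illustrative): emit (vertex, neighbor) in both directions."""
    for u, v in edges:
        yield u, v
        yield v, u

def reduce_neighborhoods(pairs):
    """Reduce phase (illustrative): group by vertex into closed neighborhoods."""
    nbrs = defaultdict(set)
    for u, v in pairs:
        nbrs[u].add(v)
    for u, gamma in nbrs.items():
        gamma.add(u)  # SCAN's neighborhood includes the vertex itself
    return dict(nbrs)

def structural_similarity(nbrs, u, v):
    """Shared closed neighbors over the geometric mean of neighborhood sizes."""
    shared = len(nbrs[u] & nbrs[v])
    return shared / math.sqrt(len(nbrs[u]) * len(nbrs[v]))

def core_vertices(nbrs, eps, mu):
    """Vertices with at least mu closed neighbors of similarity >= eps."""
    cores = set()
    for u, gamma in nbrs.items():
        similar = sum(
            1 for v in gamma if structural_similarity(nbrs, u, v) >= eps
        )
        if similar >= mu:
            cores.add(u)
    return cores

# Hypothetical toy graph: two triangles joined by the bridge edge (2, 3).
TOY_EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
TOY_NBRS = reduce_neighborhoods(map_edges(TOY_EDGES))
```

On the toy graph, vertices inside a triangle are much more similar to each other than the two endpoints of the bridge, which is the signal SCAN uses to separate clusters from hubs and outliers. In a real MapReduce job the grouping in `reduce_neighborhoods` would be done by the framework's shuffle step rather than by an in-memory dictionary.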
Pages: 14