Graph Partitioning for Distributed Graph Processing

Cited by: 24
Authors
Onizuka M. [1]
Fujimori T. [1]
Shiokawa H. [2]
Affiliations
[1] Graduate School of Information Science and Technology, Osaka University, 1-5, Yamadaoka, Suita, 565-0871, Osaka
[2] Center of Computational Sciences, University of Tsukuba, 1-1-1, Tennoudai, Tsukuba, Ibaraki
Keywords
Distributed processing; Graph mining; Graph partitioning
DOI
10.1007/s41019-017-0034-4
Abstract
There is strong demand for distributed engines that efficiently process large-scale graph data, such as social graphs and web graphs. Distributed graph engines partition the input graph and assign the partitions to distributed computers before running the analysis, so the quality of graph partitioning largely determines the communication cost and load balance among computers during the analysis. We propose an effective graph partitioning technique that achieves low communication cost and good load balance among computers at the same time. We first generate more clusters than the number of computers by extending modularity-based clustering, and then merge those clusters into balanced-size clusters, using techniques designed for the graph packing problem, until the number of clusters equals the number of computers. We implemented our technique on top of the distributed graph engine PowerGraph and conducted extensive experiments. The results show that our partitioning technique reduces the communication cost and thereby improves the response time of graph analysis patterns. In particular, PageRank computation is up to 3.2 times faster than with HDRF, the state-of-the-art streaming-based partitioning approach. © 2017, The Author(s).
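The merge step described in the abstract (reducing many clusters to one balanced group per computer) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a plain greedy largest-first packing heuristic, ignores the edges between clusters, and takes the cluster sizes as given. The names `pack_clusters` and `machine_loads` and the sample sizes are hypothetical.

```python
import heapq


def pack_clusters(cluster_sizes, num_machines):
    """Assign clusters to machines with a greedy largest-first heuristic.

    Assumption: this stands in for the packing techniques mentioned in the
    abstract; it balances total size only and does not consider edge cuts.
    cluster_sizes maps a cluster id to its size (e.g. number of vertices).
    Returns a dict mapping each cluster id to a machine index.
    """
    if num_machines <= 0:
        raise ValueError("num_machines must be positive")

    # Min-heap of (current load, machine index): the lightest machine is on top.
    heap = [(0, machine) for machine in range(num_machines)]
    heapq.heapify(heap)

    assignment = {}
    # Place the largest clusters first; ties are broken by cluster id.
    ordered = sorted(cluster_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for cluster_id, size in ordered:
        load, machine = heapq.heappop(heap)
        assignment[cluster_id] = machine
        heapq.heappush(heap, (load + size, machine))
    return assignment


def machine_loads(cluster_sizes, assignment, num_machines):
    """Return the total cluster size assigned to each machine."""
    loads = [0] * num_machines
    for cluster_id, machine in assignment.items():
        loads[machine] += cluster_sizes[cluster_id]
    return loads


if __name__ == "__main__":
    # Hypothetical sizes of five clusters produced by an over-clustering step.
    sizes = {"a": 7, "b": 5, "c": 4, "d": 3, "e": 1}
    placement = pack_clusters(sizes, num_machines=2)
    print(placement)
    print(machine_loads(sizes, placement, 2))
```

Generating more clusters than computers gives a packing step like this one enough small pieces to even out the load. A fuller version would presumably also weigh the edges between clusters so that merging keeps communication cost low.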
Pages: 94-105
Number of pages: 11