RGraph: Effective Distributed Graph Data Processing System Based on RDMA

Cited by: 0
Authors
Cui P.-J. [1]
Yuan Y. [2]
Li C.-H. [1]
Zhang C. [1]
Wang G.-R. [2]
Affiliations
[1] School of Computer Science and Engineering, Northeastern University, Shenyang
[2] School of Computer Science and Technology, Beijing Institute of Technology, Beijing
Source
Ruan Jian Xue Bao/Journal of Software | 2022, Vol. 33, No. 3
Keywords
Distributed; Dynamic load balance; Graph processing system; High performance; RDMA; RDMA communication model
DOI
10.13328/j.cnki.jos.006449
Abstract
A graph is an important data structure that describes the relationships between entities, and it is widely used in information science, physics, biology, environmental ecology, and other scientific fields. As graph data keeps growing in size, processing large-scale graphs on distributed systems has become popular, and many specialized distributed systems have been proposed, including Pregel, GraphX, PowerGraph, and Gemini. However, compared with the current state-of-the-art shared-memory graph processing systems, these specialized distributed systems do not deliver satisfactory or stable performance advantages on real-world graph datasets. This study analyzes several representative distributed graph processing systems and summarizes the major challenges that affect their performance. It then proposes RGraph, an effective distributed graph processing system based on RDMA. The key idea of RGraph is to improve performance by making full use of the advantages of RDMA. For graph partitioning, RGraph adopts chunk-based partitioning to avoid destroying the native locality of real-world graphs, so that vertex accesses remain locality-preserving. For workload, RGraph proposes a task migration mechanism based on RDMA one-sided READ and a fine-grained task preemption method among threads to ensure dynamic load balance both between nodes and within a node, so that all computing resources can be fully utilized. For communication, RGraph encapsulates IB verbs and implements a concurrent RDMA communication stack that satisfies graph computing semantics. Compared with traditional MPI, RGraph's communication stack reduces the latency of communication between servers by up to 2.1 times. Finally, five real-world large-scale graph datasets and one synthetic dataset are used to evaluate RGraph on an HPC cluster with eight servers, and the experiments show that RGraph has obvious performance advantages. Compared with PowerGraph, RGraph achieves a 10.1-16.8 times performance improvement. Compared with the existing state-of-the-art CPU-based distributed graph processing system, RGraph still achieves a 2.89-5.12 times performance improvement. Meanwhile, RGraph maintains a stable performance advantage on extremely skewed power-law graphs. © Copyright 2022, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
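As an illustration of the chunk-based partitioning mentioned in the abstract, the following is a minimal sketch, not RGraph's actual implementation. It assumes vertices are numbered 0..n-1 and assigns each server one contiguous range of vertex IDs, so that neighbouring IDs (which tend to be neighbours in real-world graphs) stay on the same server. The abstract does not say how chunk boundaries are chosen; the function name `chunk_partition`, the per-vertex weight `alpha + degree`, and the greedy boundary rule are all assumptions made for this example.

```python
from typing import List, Tuple


def chunk_partition(degrees: List[int], num_parts: int,
                    alpha: float = 1.0) -> List[Tuple[int, int]]:
    """Split vertex IDs 0..n-1 into contiguous half-open ranges [start, end).

    Each vertex is weighted by (alpha + degree); this weighting is an
    assumption for illustration, not taken from the paper. Ranges are cut
    greedily so that each part receives roughly an equal share of the
    weight that is still unassigned.
    """
    n = len(degrees)
    remaining = sum(alpha + d for d in degrees)
    ranges: List[Tuple[int, int]] = []
    start = 0
    for part in range(num_parts):
        parts_left = num_parts - part
        if parts_left == 1:
            # The last part takes every vertex that is left.
            end = n
        else:
            target = remaining / parts_left
            acc = 0.0
            end = start
            # Extend the chunk until it reaches its share of the weight.
            while end < n and acc < target:
                acc += alpha + degrees[end]
                end += 1
            remaining -= acc
        ranges.append((start, end))
        start = end
    return ranges


if __name__ == "__main__":
    # Ten vertices with two high-degree hubs, split across two servers.
    print(chunk_partition([4, 1, 1, 1, 1, 4, 1, 1, 1, 1], num_parts=2))
```

Because every part is a contiguous ID range, finding the owner of a vertex only requires comparing its ID against the range boundaries, with no per-vertex lookup table.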
Pages: 1018-1042
Page count: 24
Related papers
35 entries in total
  • [1] (2021)
  • [2] Yuan Y, Lian X, Wang G, et al., Constrained shortest path query in a large time-dependent graph, Proc. of the VLDB Endowment, 12, 10, pp. 1058-1070, (2019)
  • [3] Qiu X, Cen W, Qian Z, Peng Y, Zhang Y, Lin X, Zhou J., Real-time constrained cycle detection in large dynamic graphs, Proc. of the VLDB Endowment, 11, 12, pp. 1876-1888, (2018)
  • [4] Bronson N, Amsden Z, Cabrera G, et al., TAO: Facebook's distributed data store for the social graph, Proc. of the USENIX Annual Technical Conf. Association for Computing Machinery, pp. 49-60, (2013)
  • [5] Shi J, Yao Y, Chen R, Chen H, Li F., Fast and concurrent RDF queries with RDMA-based distributed graph exploration, Proc. of the 12th USENIX Symp. on Operating Systems Design and Implementation, pp. 317-332, (2016)
  • [6] Zhang Y, Chen R, Chen H., Sub-millisecond stateful stream querying over fast-evolving linked data, Proc. of the 26th Symp. on Operating Systems Principles, pp. 614-630, (2017)
  • [7] (2021)
  • [8] Malewicz G, Austern MH, Bik AJC, et al., Pregel: A system for large-scale graph processing, Proc. of the SIGMOD. Association for Computing Machinery, pp. 135-146, (2010)
  • [9] Xin RS, Gonzalez JE, Franklin MJ, et al., GraphX: A resilient distributed graph system on Spark, Proc. of the Graph Data Management Experiences and Systems, pp. 1-6, (2013)
  • [10] Gonzalez JE, Low Y, Gu H, et al., PowerGraph: Distributed graph-parallel computation on natural graphs, Proc. of the 10th USENIX Symp. on Operating Systems Design and Implementation, pp. 17-30, (2012)