Towards Memory and Computation Efficient Graph Processing on Spark

Cited by: 0
Authors
Tian, Xinhui [1, 2]
Guo, Yuanqing [1]
Zhan, Jianfeng [1, 2]
Wang, Lei [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Keywords
distributed system; Spark; big graph; STRATEGIES; CLOUD;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Algorithms for large-scale natural graph processing can be categorized into two types based on their value propagation behavior: unidirectional value propagation (UVP) algorithms and bidirectional value propagation (BVP) algorithms. How vertices interact with their neighbors also differs between the two types, which demands different system design choices. However, current distributed graph processing systems usually try to support both types in one general-purpose framework. Such a design cannot promise good performance and low resource consumption for both types. In particular, for UVP algorithms, current systems cannot guarantee a low memory footprint, computation efficiency, and communication efficiency at the same time. In this paper, we propose GraphV, a new graph processing engine on Spark that is specially designed for UVP algorithms and satisfies all of the above requirements for this type of algorithm. To retain generality for other algorithms, we also build a dual-engine framework by integrating GraphV with GraphX, Spark's existing graph processing engine. The main design choices of GraphV include a cheap propagation-related partitioner, a one-step computation model, and a locality-aware local graph layout. According to the experimental results, GraphV is faster than GraphX by factors of 1.2x to 31x, with much lower resource consumption. The source code of GraphV will be publicly available from http://prof.ict.ac.cn/GraphV.
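Illustrative note: the abstract does not define the two propagation types in detail. The sketch below assumes one plausible reading, namely that a unidirectional algorithm sends a vertex's value only along the direction of its edges, while a bidirectional algorithm also sends values against the edge direction. The function `propagate_once`, the min-combine rule, and the toy graph are hypothetical and are not taken from GraphV or GraphX.

```python
from collections import defaultdict


def propagate_once(edges, values, bidirectional=False):
    """Run one round of min-value propagation over a directed edge list.

    Assumption (not from the source): "unidirectional" means values travel
    only from src to dst; "bidirectional" means they also travel dst to src.
    """
    incoming = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(values[src])      # value flows src -> dst
        if bidirectional:
            incoming[src].append(values[dst])  # value also flows dst -> src
    # Each vertex keeps the minimum of its own value and what it received.
    return {v: min([val] + incoming[v]) for v, val in values.items()}


edges = [(1, 2), (2, 3)]           # toy directed path 1 -> 2 -> 3
values = {1: 10, 2: 5, 3: 7}

uvp = propagate_once(edges, values)                      # forward only
bvp = propagate_once(edges, values, bidirectional=True)  # both directions
```

Under this reading, vertex 1 never receives a value in the unidirectional run because it has no in-edges, whereas in the bidirectional run it picks up the smaller value of vertex 2. A vertex in the unidirectional case needs access to only one side of its edges, which is the kind of difference in neighbor interaction that the abstract says motivates separate system designs.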
Pages: 375 - 382
Page count: 8
Related papers
50 in total
  • [1] Towards Online Graph Processing with Spark Streaming
    Abughofa, Tariq
    Zulkernine, Farhana
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 2787 - 2794
  • [2] Towards efficient allocation of graph convolutional networks on hybrid computation-in-memory architecture
    Jiaxian CHEN
    Guanquan LIN
    Jiexin CHEN
    Yi WANG
    [J]. Science China(Information Sciences), 2021, 64 (06) : 112 - 125
  • [3] Towards efficient allocation of graph convolutional networks on hybrid computation-in-memory architecture
    Chen, Jiaxian
    Lin, Guanquan
    Chen, Jiexin
    Wang, Yi
    [J]. SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (06)
  • [4] Towards efficient allocation of graph convolutional networks on hybrid computation-in-memory architecture
    Jiaxian Chen
    Guanquan Lin
    Jiexin Chen
    Yi Wang
    [J]. Science China Information Sciences, 2021, 64
  • [5] GPU in-memory processing using Spark for iterative computation
    Hong, Sumin
    Choi, Woohyuk
    Jeong, Won-Ki
    [J]. 2017 17TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2017, : 31 - 41
  • [6] Congra: Towards Efficient Processing of Concurrent Graph Queries on Shared-Memory Machines
    Pan, Peitian
    Li, Chao
    [J]. 2017 IEEE 35TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD), 2017, : 217 - 224
  • [7] Seraph: Towards Scalable and Efficient Fully-external Graph Computation via On-demand Processing
    Yang, Tsun-Yu
    Chen, Yizou
    Liang, Yuhong
    Yang, Ming-Chang
    [J]. PROCEEDINGS OF THE 22ND USENIX CONFERENCE ON FILE AND STORAGE TECHNOLOGIES, FAST 24, 2024, : 373 - 387
  • [8] Seraph: Towards Scalable and Efficient Fully-external Graph Computation via On-demand Processing
    Yang, Tsun-Yu
    Chen, Yizou
    Liang, Yuhong
    Yang, Ming-Chang
    [J]. PROCEEDINGS OF THE 21ST USENIX SYMPOSIUM ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION, NSDI 24, 2024, : 373 - 387
  • [9] GraphF: An Efficient Fine-Grained Graph Processing System on Spark
    Zhang, Chengfei
    Zhang, Yiming
    Li, Ziyang
    Zhao, Yunxiang
    Li, Dongsheng
    [J]. 2016 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE & COMPUTATIONAL INTELLIGENCE (CSCI), 2016, : 1406 - 1407
  • [10] SilverChunk: An Efficient In-Memory Parallel Graph Processing System
    Zheng, Tianqi
    Zhang, Zhibin
    Cheng, Xueqi
    [J]. DATABASE AND EXPERT SYSTEMS APPLICATIONS, PT II, 2019, 11707 : 222 - 236