Towards Memory and Computation Efficient Graph Processing on Spark

Cited by: 0

Authors:

Tian, Xinhui [1,2]

Guo, Yuanqing [1]

Zhan, Jianfeng [1,2]

Wang, Lei [1,2]

Affiliations:

[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2017

Keywords:

distributed system; Spark; big graph; STRATEGIES; CLOUD;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Algorithms for large-scale natural graph processing can be categorized into two types based on their value propagation behaviors: unidirectional value propagation (UVP) algorithms and bidirectional value propagation (BVP) algorithms. How vertices interact with their neighbors also differs between the two algorithm types, which demands different system design choices. However, current distributed graph processing systems usually try to support both types in one general-purpose framework. Such a system design cannot promise good performance and low resource consumption for both types. In particular, for UVP algorithms, current systems cannot guarantee a low memory footprint, computation efficiency, and communication efficiency at the same time. In this paper, we propose a new graph processing engine on Spark, GraphV, which is specially designed for UVP algorithms and can satisfy all of the above requirements for this type of algorithm. To retain generality for other algorithms, we also build a dual-engine framework by integrating GraphV with Spark's existing graph processing engine, GraphX. The main design choices of GraphV include a cheap propagation-related partitioner, a one-step computation model, and a locality-aware local graph layout. According to the experimental results, GraphV is faster than GraphX by factors of 1.2x to 31x, with much lower resource consumption. The source code of GraphV will be publicly available from http://prof.ict.ac.cn/GraphV.

Citation

Pages: 375-382

Page count: 8
