FSGraph: fast and scalable implementation of graph traversal on GPUs

Cited by: 1

Authors:

Zhang, Yuan [1,2]

Cao, Huawei [1,3]

Liang, Yan [1,2]

Zhang, Jie [1,2]

Huang, Junying [1]

Ye, Xiaochun [1]

An, Xuejun [1,2]

Affiliations:

[1] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China

[3] Univ Chinese Acad Sci, Nanjing 211135, Peoples R China

Source:

CCF TRANSACTIONS ON HIGH PERFORMANCE COMPUTING | 2023 / Vol. 5 / Issue 03

Funding:

Beijing Natural Science Foundation;

Keywords:

BFS; GPU-friendly CSR structure; Bidirectional 1d partition; UM-aware communication; ALGORITHMS;

DOI:

10.1007/s42514-023-00155-x

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Graphs are one of the best ways to express and process association relationships, and they are widely used in applications such as social networks, fraud detection, and the Internet of Things. Breadth-First Search (BFS) is a typical graph traversal algorithm, but its performance on a GPU is not desirable because of strong data dependency, intensive irregular memory access, and low computation intensity. On GPUs, the situation is even worse because of unbalanced data partitioning and high communication-to-computation ratios. In this paper, we implement FSGraph, a fast and scalable BFS implementation on GPUs. In FSGraph, we propose three optimization techniques: a GPU-friendly Compressed Sparse Row (CSR) structure, a bidirectional one-dimensional (1d) partition, and UM-aware communication. We have evaluated our work with extensive experiments on four T4 and four V100 GPUs. The average BFS performance on four T4 GPUs is 132.67 Giga-Traversed Edges per Second (GTEPS), which is up to a 1.44x improvement over a single T4. On four V100 GPUs, BFS performance reaches 392.35 GTEPS and outperforms an existing CPU-based cluster with 1024 nodes on the November 2022 Graph500 list.

Pages: 277-291

Number of pages: 15
