FSGraph: fast and scalable implementation of graph traversal on GPUs

Cited by: 1
Authors
Zhang, Yuan [1, 2]
Cao, Huawei [1, 3]
Liang, Yan [1, 2]
Zhang, Jie [1, 2]
Huang, Junying [1]
Ye, Xiaochun [1]
An, Xuejun [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Univ Chinese Acad Sci, Nanjing 211135, Peoples R China
Funding
Beijing Natural Science Foundation
Keywords
BFS; GPU-friendly CSR structure; Bidirectional 1d partition; UM-aware communication; ALGORITHMS;
DOI
10.1007/s42514-023-00155-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Graphs are one of the best ways to express and process association relationships. They are widely used in applications including social networks, fraud detection, and the Internet of Things. Breadth-First Search (BFS) is a typical graph traversal algorithm, but its performance on a GPU is poor because of strong data dependencies, intensive irregular memory accesses, and low computational intensity. On multiple GPUs, the situation is even worse because of unbalanced data partitioning and high communication-to-computation ratios. In this paper, we implement FSGraph, a fast and scalable BFS implementation on GPUs. In FSGraph, we propose three optimization techniques: a GPU-friendly Compressed Sparse Row (CSR) structure, a bidirectional one-dimensional (1d) partition, and UM-aware communication. We evaluated our work with extensive experiments on four T4 and four V100 GPUs. The average BFS performance on four T4 GPUs is 132.67 Giga-Traversed Edges per Second (GTEPS), up to a 1.44x improvement over a single T4. On four V100 GPUs, BFS achieves 392.35 GTEPS and outperforms an existing CPU-based cluster with 1024 nodes on the November 2022 Graph500 list.
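For readers unfamiliar with the terms in the abstract, the following is a minimal CPU-side sketch of a standard CSR layout and a level-synchronous BFS over it, plus the arithmetic behind a GTEPS figure. It is not FSGraph's implementation: the record does not describe the GPU-friendly CSR variant, the partitioning, or the communication scheme, so none of them appear here. The function names (`build_csr`, `bfs_levels`, `gteps`) are hypothetical, and Graph500 applies its own rules for which edges count toward TEPS.

```python
def build_csr(num_vertices, edges):
    """Build standard CSR arrays from a list of directed (src, dst) pairs."""
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1
    # row_offsets[v] .. row_offsets[v + 1] delimits the neighbors of v.
    row_offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        row_offsets[v + 1] = row_offsets[v] + degree[v]
    col_indices = [0] * len(edges)
    cursor = row_offsets[:-1].copy()  # next free slot per vertex
    for src, dst in edges:
        col_indices[cursor[src]] = dst
        cursor[src] += 1
    return row_offsets, col_indices


def bfs_levels(row_offsets, col_indices, source):
    """Level-synchronous BFS; returns (level per vertex, edges scanned)."""
    num_vertices = len(row_offsets) - 1
    level = [-1] * num_vertices  # -1 marks an unvisited vertex
    level[source] = 0
    frontier = [source]
    edges_scanned = 0
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for u in frontier:
            for i in range(row_offsets[u], row_offsets[u + 1]):
                edges_scanned += 1
                v = col_indices[i]
                if level[v] == -1:
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level, edges_scanned


def gteps(edge_count, seconds):
    """Giga-traversed edges per second: edges / time / 1e9."""
    return edge_count / seconds / 1e9


# Small undirected graph stored as directed pairs; vertex 4 is isolated.
edges = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 3), (3, 1), (2, 3), (3, 2)]
row_offsets, col_indices = build_csr(5, edges)
levels, scanned = bfs_levels(row_offsets, col_indices, source=0)
```

The irregular reads of `col_indices` and the dependency of each level on the previous one are the kind of access pattern and data dependency the abstract cites as obstacles on GPUs.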
Pages: 277-291
Page count: 15
Related papers
50 items in total
  • [11] Specialization or Generalization: A Study on Breadth-First Graph Traversal on GPUs
    Zhong, Wenyong
    Cao, Yanxin
    Li, Jiawen
    Sun, Jianhua
    Chen, Hao
    PROCEEDINGS OF 2017 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (PIC 2017), 2017, : 294 - 301
  • [12] Scalable SIMD-Efficient Graph Processing on GPUs
    Khorasani, Farzad
    Gupta, Rajiv
    Bhuyan, Laxmi N.
    2015 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION (PACT), 2015, : 39 - 50
  • [13] Scalable and efficient graph traversal on high-throughput cluster
    Fan, Dongrui
    Cao, Huawei
    Wang, Guobo
    Nie, Na
    Ye, Xiaochun
    Sun, Ninghui
    CCF TRANSACTIONS ON HIGH PERFORMANCE COMPUTING, 2021, 3 (01) : 101 - 113
  • [14] Scalable and efficient graph traversal on high-throughput cluster
    Dongrui Fan
    Huawei Cao
    Guobo Wang
    Na Nie
    Xiaochun Ye
    Ninghui Sun
    CCF Transactions on High Performance Computing, 2021, 3 : 101 - 113
  • [15] Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores
    Lin, Heng
    Tang, Xiongchao
    Yu, Bowen
    Zhuo, Youwei
    Chen, Wenguang
    Zhai, Jidong
    Yin, Wanwang
    Zheng, Weimin
    2017 31ST IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2017, : 635 - 645
  • [16] Graph traversal dynamic demonstration system design and implementation
    Chen Bangze
    Yang Xiaobo
    RESOURCES AND SUSTAINABLE DEVELOPMENT, PTS 1-4, 2013, 734-737 : 2959 - +
  • [17] Scalable and Fast Characteristic Mode Analysis using GPUs
    Alsultan, Khulud
    Hamdalla, Mohamed Z. M.
    Dey, Sumitra
    Rao, Praveen
    Hassan, Ahmed M.
    APPLIED COMPUTATIONAL ELECTROMAGNETICS SOCIETY JOURNAL, 2022, 37 (02): : 156 - 167
  • [18] Scalable and Performant Graph Processing on GPUs Using Approximate Computing
    Singh, Somesh
    Nasre, Rupesh
    IEEE TRANSACTIONS ON MULTI-SCALE COMPUTING SYSTEMS, 2018, 4 (03): : 190 - 203
  • [19] NUMA-aware Scalable Graph Traversal on SGI UV Systems
    Yasui, Yuichiro
    Fujisawa, Katsuki
    Goh, Eng Lim
    Baron, John
    Sugiura, Atsushi
    Uchiyama, Takashi
    PROCEEDINGS OF THE ACM WORKSHOP ON HIGH PERFORMANCE GRAPH PROCESSING (HPGP'16), 2016, : 19 - 26
  • [20] EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs
    Min, Seung Won
    Mailthody, Vikram Sharma
    Qureshi, Zaid
    Xiong, Jinjun
    Ebrahimi, Eiman
    Hwu, Wen-mei
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2020, 14 (02): : 114 - 127