fuseGNN: Accelerating Graph Convolutional Neural Network Training on GPGPU

Cited by: 1
Authors
Chen, Zhaodong [1]
Yan, Mingyu [1]
Zhu, Maohua [1]
Deng, Lei [1]
Li, Guoqi [2]
Li, Shuangchen [3]
Xie, Yuan [1]
Affiliations
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
[2] Tsinghua Univ, Beijing, Peoples R China
[3] Alibaba Grp, Hangzhou, Peoples R China
Funding
National Science Foundation (USA)
DOI
10.1145/3400302.3415610
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Graph convolutional neural networks (GNNs) have achieved state-of-the-art performance on tasks such as node classification and have become a new member of the data-center workload family. A GNN operates on irregular graph-structured data in three distinct phases: Combination, Graph Processing, and Aggregation. While the Combination phase is well supported by the sgemm kernels in cuBLAS, the other two phases remain inefficient on GPGPUs because optimized CUDA kernels are lacking. In particular, the Aggregation phase incurs a large DRAM storage footprint and heavy data movement, and both the Aggregation and Graph Processing phases suffer from high kernel launch time. These inefficiencies not only reduce training throughput but also prevent users from training GNNs on larger graphs on GPGPUs. Recent studies have partially alleviated these problems, but their optimizations are still insufficient. In this paper, we propose fuseGNN, an extension of PyTorch that provides highly optimized APIs and CUDA kernels for GNNs. First, two different programming abstractions for the Aggregation phase are used to handle graphs with different average degrees. Second, dedicated GPGPU kernels are developed for Aggregation and Graph Processing in both the forward and backward passes; in these kernels, kernel fusion and other optimization strategies are applied to reduce kernel launch time and latency and to exploit data-reuse opportunities. Evaluation on multiple benchmarks shows that fuseGNN achieves up to 5.3x end-to-end speedup over state-of-the-art frameworks and reduces the DRAM storage footprint by several orders of magnitude on large datasets.
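The three phases named in the abstract can be made concrete with a small reference sketch. It is not the fuseGNN implementation or API: the function name `gcn_layer`, its parameters, and the mapping of phases to operations are assumptions for illustration, based on the widely used GCN formulation with self-loops and symmetric degree normalization. The graph is assumed to be undirected, with every edge listed in both directions.

```python
import math


def gcn_layer(features, weight, edges, num_nodes):
    """Reference sketch of one GCN layer (hypothetical helper, not fuseGNN code)."""
    # Combination: dense feature transform H = X @ W.
    # On a GPU this is the part a cuBLAS sgemm call would cover.
    out_dim = len(weight[0])
    h = [
        [sum(x * weight[k][j] for k, x in enumerate(row)) for j in range(out_dim)]
        for row in features
    ]

    # Graph Processing (assumed meaning): add self-loops, count degrees,
    # and compute the per-edge weight 1 / sqrt(deg(src) * deg(dst)).
    all_edges = list(edges) + [(v, v) for v in range(num_nodes)]
    deg = [0] * num_nodes
    for _, dst in all_edges:
        deg[dst] += 1
    edge_weight = [1.0 / math.sqrt(deg[src] * deg[dst]) for src, dst in all_edges]

    # Aggregation: weighted sum of source-node features into each destination.
    # The sum is accumulated in place, so no per-edge message list is stored.
    out = [[0.0] * out_dim for _ in range(num_nodes)]
    for (src, dst), w in zip(all_edges, edge_weight):
        for j in range(out_dim):
            out[dst][j] += w * h[src][j]
    return out


# Usage on a 3-node path graph 0 - 1 - 2 with one feature per node.
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
out = gcn_layer([[1.0], [2.0], [3.0]], [[1.0]], edges, 3)
```

In a framework that runs each step as a separate GPU kernel, the degree count, the edge-weight computation, and the accumulation would each cost a kernel launch and possibly an intermediate buffer. That is the kind of overhead the abstract says kernel fusion targets, although the abstract does not describe how the fused kernels are organized.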
Pages: 9