Sparsity-Aware Communication for Distributed Graph Neural Network Training

Cited by: 0
Authors
Mukhopadhyay, Ujjaini [1 ]
Tripathy, Alok [1 ]
Selvitopi, Oguz [2 ]
Yelick, Katherine [1 ]
Buluc, Aydin [2 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Lawrence Berkeley Nat Lab, Berkeley, CA USA
Keywords
MATRIX MULTIPLICATION
DOI
10.1145/3673038.3673152
CLC Classification Number
TP3 [Computing technology, computer technology]
Subject Classification Code
0812
Abstract
Graph Neural Networks (GNNs) are a computationally efficient method to learn embeddings and classifications on graph data. However, GNN training has low computational intensity, making communication costs the bottleneck for scalability. Sparse-matrix dense-matrix multiplication (SpMM) is the core computational operation in full-graph training of GNNs. Previous work parallelizing this operation focused on sparsity-oblivious algorithms, where matrix elements are communicated regardless of the sparsity pattern. This leads to a predictable communication pattern that can be overlapped with computation and enables the use of collective communication operations, at the expense of wasting significant bandwidth on unnecessary data. We develop sparsity-aware algorithms that tackle the communication bottlenecks in GNN training with three novel approaches. First, we communicate only the necessary matrix elements. Second, we use a graph partitioning model to reorder the matrix and drastically reduce the number of communicated elements. Finally, we address the high load imbalance in communication with a tailored partitioning model that minimizes both the total communication volume and the maximum sending volume. We further couple these sparsity-exploiting approaches with a communication-avoiding approach (1.5D parallel SpMM) in which submatrices are replicated to reduce communication. We explore the tradeoffs of these combined optimizations and show up to a 14x improvement on 256 GPUs relative to a popular GNN framework based on communication-oblivious SpMM; in some instances, communication is reduced to nearly zero, yielding effectively communication-free parallel training.
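To make the first idea concrete, the following is a minimal single-process sketch, assuming a 1D row decomposition of the adjacency matrix A and the feature matrix H across P ranks. It contrasts sparsity-aware gathering (fetch only the rows of H referenced by local nonzeros) with a sparsity-oblivious exchange that would move every remote block. The NumPy/SciPy simulation, the function names, and the volume accounting are illustrative assumptions, not the authors' implementation.

import numpy as np
import scipy.sparse as sp

# Toy setup (illustrative assumptions): P "ranks" each own a contiguous block of
# rows of the sparse adjacency matrix A (n x n) and the matching block of rows
# of the dense feature matrix H (n x f).
P, n, f = 4, 16, 8
rng = np.random.default_rng(0)
A = sp.random(n, n, density=0.15, format="csr", random_state=0)
H = rng.standard_normal((n, f))
rows_per_rank = n // P

def spmm_sparsity_aware(rank):
    """Compute the local block of A @ H, fetching only the rows of H that the
    local sparse block actually references, instead of every remote block."""
    lo, hi = rank * rows_per_rank, (rank + 1) * rows_per_rank
    A_local = A[lo:hi, :]                      # locally owned sparse rows (CSR)

    # Column indices with at least one nonzero tell us which rows of H we need.
    needed = np.unique(A_local.indices)
    remote = needed[(needed < lo) | (needed >= hi)]

    # "Communication": gather only the required remote rows of H. In a real MPI
    # run these would be point-to-point messages routed to the owning ranks.
    H_gathered = np.zeros((n, f))
    H_gathered[lo:hi] = H[lo:hi]               # local rows, no communication
    H_gathered[remote] = H[remote]             # only the necessary remote rows

    recv_words = remote.size * f               # words actually communicated
    oblivious_words = (n - rows_per_rank) * f  # words a full block exchange moves
    return A_local @ H_gathered, recv_words, oblivious_words

for r in range(P):
    out, vol, full = spmm_sparsity_aware(r)
    print(f"rank {r}: received {vol} words vs. {full} words sparsity-oblivious")

In a distributed implementation, the rows collected in `remote` would be requested from their owning ranks with point-to-point messages, which is what makes the communicated volume proportional to the sparsity pattern rather than to the full block size; graph partitioning and the 1.5D replication described above then shrink and balance that volume further.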
Pages: 117-126
Number of pages: 10