Sparsity-Aware Communication for Distributed Graph Neural Network Training

Cited by: 0
Authors
Mukhopadhyay, Ujjaini [1 ]
Tripathy, Alok [1 ]
Selvitopi, Oguz [2 ]
Yelick, Katherine [1 ]
Buluc, Aydin [2 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Lawrence Berkeley Nat Lab, Berkeley, CA USA
Keywords
MATRIX MULTIPLICATION;
DOI
10.1145/3673038.3673152
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline code
0812;
Abstract
Graph Neural Networks (GNNs) are a computationally efficient method to learn embeddings and classifications on graph data. However, GNN training has low computational intensity, making communication costs the bottleneck for scalability. Sparse-matrix dense-matrix multiplication (SpMM) is the core computational operation in full-graph training of GNNs. Previous work parallelizing this operation focused on sparsity-oblivious algorithms, in which matrix elements are communicated regardless of the sparsity pattern. This yields a predictable communication pattern that can be overlapped with computation and enables the use of collective communication operations, at the expense of wasting significant bandwidth on unnecessary data. We develop sparsity-aware algorithms that tackle the communication bottlenecks in GNN training with three novel approaches. First, we communicate only the necessary matrix elements. Second, we utilize a graph partitioning model to reorder the matrix and drastically reduce the number of communicated elements. Finally, we address the high load imbalance in communication with a tailored partitioning model that minimizes both the total communication volume and the maximum sending volume. We further couple these sparsity-exploiting approaches with a communication-avoiding approach (1.5D parallel SpMM) in which submatrices are replicated to reduce communication. We explore the tradeoffs of these combined optimizations and show up to a 14x improvement on 256 GPUs relative to a popular GNN framework based on communication-oblivious SpMM; in some instances, communication is reduced to almost zero, yielding effectively communication-free parallel training.
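To make the sparsity-aware idea concrete, the following is a minimal, hypothetical Python/SciPy sketch (not the authors' 1.5D GPU implementation) of a 1D row-partitioned SpMM. It contrasts the data a process would fetch under sparsity-oblivious communication (every remote feature block) with sparsity-aware communication (only the feature rows touched by its local nonzeros); the function name `needed_rows`, the toy sizes, and the serial "processes" are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp

def needed_rows(A_local, col_range):
    """Return the column indices of A_local's nonzeros that fall in
    [lo, hi): under sparsity-aware communication these are the only
    feature-matrix rows that must be fetched from that range's owner."""
    lo, hi = col_range
    cols = np.unique(A_local.tocoo().col)
    return cols[(cols >= lo) & (cols < hi)]

# Toy setup (illustrative sizes): n x n adjacency, feature width f,
# 1D row partition over P hypothetical processes.
n, f, P = 16, 4, 4
rng = np.random.default_rng(0)
A = sp.random(n, n, density=0.1, format="csr", random_state=0)
H = rng.standard_normal((n, f))                      # dense feature matrix
rows_per = n // P
blocks = [A[p * rows_per:(p + 1) * rows_per, :] for p in range(P)]

oblivious_words = aware_words = 0
for p in range(P):                                   # receiving process
    for q in range(P):                               # owner of a remote row block of H
        if q == p:
            continue
        need = needed_rows(blocks[p], (q * rows_per, (q + 1) * rows_per))
        aware_words += len(need) * f                 # only rows touched by nonzeros
        oblivious_words += rows_per * f              # whole block, sparsity ignored
print(f"words moved: oblivious={oblivious_words}, sparsity-aware={aware_words}")

# Each process multiplies its row block of A by H; in a real distributed
# run the remote rows of H identified above would be gathered first.
Y = np.vstack([blocks[p] @ H for p in range(P)])
assert np.allclose(Y, A @ H)
```

On this toy instance the sparsity-aware count is already well below the oblivious one; the paper's graph-partitioning reordering and load-balanced partitioning would further shrink and balance the per-process `needed_rows` sets.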
Pages: 117-126 (10 pages)