Hypergraph partitioning for sparse matrix-matrix multiplication

Cited by: 2
Authors
Ballard G. [1 ]
Druinsky A. [2 ]
Knight N. [3 ]
Schwartz O. [4 ]
Affiliations
[1] Department of Computer Science, Wake Forest University, PO Box 7311, Winston-Salem, NC 27109
[2] Computational Research Division, Lawrence Berkeley National Laboratory, MS 50F-1650, 1 Cyclotron Rd., Berkeley, CA 94720
[3] Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012
[4] Benin School of Computer Science and Engineering, Hebrew University of Jerusalem, Jerusalem, Israel
Funding
Israel Science Foundation
Keywords
Hypergraph partitioning; Sparse matrix-matrix multiplication
DOI
10.1145/3015144
Abstract
We propose a fine-grained hypergraph model for sparse matrix-matrix multiplication (SpGEMM), a key computational kernel in scientific computing and data analysis whose performance is often communication bound. This model correctly describes both the interprocessor communication volume along a critical path in a parallel computation and also the volume of data moving through the memory hierarchy in a sequential computation. We show that identifying a communication-optimal algorithm for particular input matrices is equivalent to solving a hypergraph partitioning problem. Our approach is nonzero structure dependent, meaning that we seek the best algorithm for the given input matrices. In addition to our three-dimensional fine-grained model, we also propose coarse-grained one-dimensional and two-dimensional models that correspond to simpler SpGEMM algorithms. We explore the relations between our models theoretically, and we study their performance experimentally in the context of three applications that use SpGEMM as a key computation. For each application, we find that at least one coarse-grained model is as communication efficient as the fine-grained model. We also observe that different applications have affinities for different algorithms. Our results demonstrate that hypergraphs are an accurate model for reasoning about the communication costs of SpGEMM as well as a practical tool for exploring the SpGEMM algorithm design space. © 2016 ACM.
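To make the fine-grained model described in the abstract concrete, here is a minimal illustrative sketch (our own construction, not the authors' code): each scalar multiplication A[i,k]·B[k,j] becomes a vertex, and each nonzero of A, of B, and of the product C defines a hyperedge (net) grouping the multiplication tasks that read or write it. Partitioning the vertices then assigns tasks to processors, and cut nets correspond to communicated data.

```python
# Hypothetical sketch of the fine-grained hypergraph model for SpGEMM;
# names and structure are illustrative, not the paper's implementation.
from collections import defaultdict

from scipy.sparse import random as sprandom, csr_matrix

def fine_grained_hypergraph(A, B):
    """Vertices are scalar multiplication tasks (i, k, j) with A[i,k] != 0
    and B[k,j] != 0.  Each nonzero of A, of B, and of C = A @ B defines a
    hyperedge (net) containing the tasks that read or write it."""
    A = csr_matrix(A)
    B = csr_matrix(B)
    vertices = []
    nets = defaultdict(set)              # net id -> set of vertex indices
    for i in range(A.shape[0]):
        for k in A.indices[A.indptr[i]:A.indptr[i + 1]]:
            for j in B.indices[B.indptr[k]:B.indptr[k + 1]]:
                v = len(vertices)
                vertices.append((i, k, j))
                nets[('A', i, k)].add(v)  # task reads A[i,k]
                nets[('B', k, j)].add(v)  # task reads B[k,j]
                nets[('C', i, j)].add(v)  # task contributes to C[i,j]
    return vertices, dict(nets)

A = sprandom(4, 4, density=0.4, random_state=0, format='csr')
B = sprandom(4, 4, density=0.4, random_state=1, format='csr')
V, N = fine_grained_hypergraph(A, B)
# A balanced partition of V minimizing the connectivity cut over N
# corresponds to a communication-efficient parallel SpGEMM for these inputs.
print(len(V), "multiplication tasks,", len(N), "nets")
```

Feeding the resulting hypergraph to a partitioner such as PaToH or hMETIS would then yield the nonzero-structure-dependent task assignment the paper studies; the coarse-grained 1D and 2D models can be seen as restrictions of this construction in which whole rows or submatrices are assigned together.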
Pages: 1-34
Page count: 33
Related papers
50 records total
  • [41] JITSPMM: Just-in-Time Instruction Generation for Accelerated Sparse Matrix-Matrix Multiplication
    Fu, Qiang
    Rolinger, Thomas B.
    Huang, H. Howie
    2024 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, CGO, 2024, : 448 - 459
  • [42] Performance-Portable Sparse Matrix-Matrix Multiplication for Many-Core Architectures
    Deveci, Mehmet
    Trott, Christian
    Rajamanickam, Sivasankaran
    2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017, : 693 - 702
  • [43] An Accelerator for Sparse Convolutional Neural Networks Leveraging Systolic General Matrix-matrix Multiplication
    Soltaniyeh, Mohammadreza
    Martin, Richard P.
    Nagarakatte, Santosh
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2022, 19 (03)
  • [44] Reducing inter-process communication overhead in parallel sparse matrix-matrix multiplication
    Ahmed M.S.
    Houser J.
    Hoque M.A.
    Raju R.
    Pfeiffer P.
    Int. J. Grid High Perform. Comput., 3: 46-59
  • [45] SpFlow: Memory-driven Data Flow Optimization for Sparse Matrix-Matrix Multiplication
    Nie, Qi
    Malik, Sharad
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019,
  • [46] Accelerating Sparse General Matrix-Matrix Multiplication for NVIDIA Volta GPU and Hygon DCU
    Tian, Zhuo
    Yang, Shuai
    Zhang, Changyou
    PROCEEDINGS OF THE 32ND INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, HPDC 2023, 2023, : 329 - 330
  • [47] Bandwidth Optimized Parallel Algorithms for Sparse Matrix-Matrix Multiplication using Propagation Blocking
    Gu, Zhixiang
    Moreira, Jose
    Edelsohn, David
    Azad, Ariful
    PROCEEDINGS OF THE 32ND ACM SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES (SPAA '20), 2020, : 293 - 303
  • [48] Performance-Aware Model for Sparse Matrix-Matrix Multiplication on the Sunway TaihuLight Supercomputer
    Chen, Yuedan
    Li, Kenli
    Yang, Wangdong
    Xiao, Guoqing
    Xie, Xianghui
    Li, Tao
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2019, 30 (04) : 923 - 938
  • [49] Parallel sparse matrix-matrix multiplication: a scalable solution with 1D algorithm
    Hoque, Mohammad Asadul
    Raju, Md Rezaul Karim
    Tymczak, Christopher John
    Vrinceanu, Daniel
    Chilakamarri, Kiran
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2015, 11 (04) : 391 - 401
  • [50] Learning from Optimizing Matrix-Matrix Multiplication
    Parikh, Devangi N.
    Huang, Jianyu
    Myers, Margaret E.
    van de Geijn, Robert A.
    2018 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2018), 2018, : 332 - 339