Hypergraph partitioning for sparse matrix-matrix multiplication

Cited by: 2
Authors
Ballard G. [1 ]
Druinsky A. [2 ]
Knight N. [3 ]
Schwartz O. [4 ]
Affiliations
[1] Department of Computer Science, Wake Forest University, PO Box 7311, Winston-Salem, NC 27109
[2] Computational Research Division, Lawrence Berkeley National Laboratory, MS 50F-1650, 1 Cyclotron Rd., Berkeley, CA 94720
[3] Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012
[4] Benin School of Computer Science and Engineering, Hebrew University of Jerusalem, Jerusalem, Israel
Funding
Israel Science Foundation
Keywords
Hypergraph partitioning; Sparse matrix-matrix multiplication
DOI
10.1145/3015144
Abstract
We propose a fine-grained hypergraph model for sparse matrix-matrix multiplication (SpGEMM), a key computational kernel in scientific computing and data analysis whose performance is often communication bound. This model correctly describes both the interprocessor communication volume along a critical path in a parallel computation and also the volume of data moving through the memory hierarchy in a sequential computation. We show that identifying a communication-optimal algorithm for particular input matrices is equivalent to solving a hypergraph partitioning problem. Our approach is nonzero structure dependent, meaning that we seek the best algorithm for the given input matrices. In addition to our three-dimensional fine-grained model, we also propose coarse-grained one-dimensional and two-dimensional models that correspond to simpler SpGEMM algorithms. We explore the relations between our models theoretically, and we study their performance experimentally in the context of three applications that use SpGEMM as a key computation. For each application, we find that at least one coarse-grained model is as communication efficient as the fine-grained model. We also observe that different applications have affinities for different algorithms. Our results demonstrate that hypergraphs are an accurate model for reasoning about the communication costs of SpGEMM as well as a practical tool for exploring the SpGEMM algorithm design space. © 2016 ACM.
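To make the fine-grained model described in the abstract concrete, here is a minimal illustrative sketch (our own construction, not the authors' code): each scalar multiplication A[i,k]·B[k,j] becomes a vertex, and each nonzero of A, of B, and of the product C defines a hyperedge (net) grouping the multiplication tasks that read or write it. Partitioning the vertices then assigns tasks to processors, and cut nets correspond to communicated data.

```python
# Hypothetical sketch of the fine-grained hypergraph model for SpGEMM;
# names and structure are illustrative, not the paper's implementation.
from collections import defaultdict

from scipy.sparse import random as sprandom, csr_matrix

def fine_grained_hypergraph(A, B):
    """Vertices are scalar multiplication tasks (i, k, j) with A[i,k] != 0
    and B[k,j] != 0.  Each nonzero of A, of B, and of C = A @ B defines a
    hyperedge (net) containing the tasks that read or write it."""
    A = csr_matrix(A)
    B = csr_matrix(B)
    vertices = []
    nets = defaultdict(set)              # net id -> set of vertex indices
    for i in range(A.shape[0]):
        for k in A.indices[A.indptr[i]:A.indptr[i + 1]]:
            for j in B.indices[B.indptr[k]:B.indptr[k + 1]]:
                v = len(vertices)
                vertices.append((i, k, j))
                nets[('A', i, k)].add(v)  # task reads A[i,k]
                nets[('B', k, j)].add(v)  # task reads B[k,j]
                nets[('C', i, j)].add(v)  # task contributes to C[i,j]
    return vertices, dict(nets)

A = sprandom(4, 4, density=0.4, random_state=0, format='csr')
B = sprandom(4, 4, density=0.4, random_state=1, format='csr')
V, N = fine_grained_hypergraph(A, B)
# A balanced partition of V minimizing the connectivity cut over N
# corresponds to a communication-efficient parallel SpGEMM for these inputs.
print(len(V), "multiplication tasks,", len(N), "nets")
```

Feeding the resulting hypergraph to a partitioner such as PaToH or hMETIS would then yield the nonzero-structure-dependent task assignment the paper studies; the coarse-grained 1D and 2D models can be seen as restrictions of this construction in which whole rows or submatrices are assigned together.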
Pages: 1-34
Page count: 33
Related papers
50 records total
  • [41] JITSPMM: Just-in-Time Instruction Generation for Accelerated Sparse Matrix-Matrix Multiplication
    Fu, Qiang
    Rolinger, Thomas B.
    Huang, H. Howie
    2024 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, CGO, 2024, : 448 - 459
  • [42] Performance-Portable Sparse Matrix-Matrix Multiplication for Many-Core Architectures
    Deveci, Mehmet
    Trott, Christian
    Rajamanickam, Sivasankaran
    2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017, : 693 - 702
  • [43] An Accelerator for Sparse Convolutional Neural Networks Leveraging Systolic General Matrix-matrix Multiplication
    Soltaniyeh, Mohammadreza
    Martin, Richard P.
    Nagarakatte, Santosh
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2022, 19 (03)
  • [44] Reducing inter-process communication overhead in parallel sparse matrix-matrix multiplication
    Ahmed M.S.
    Houser J.
    Hoque M.A.
    Raju R.
    Pfeiffer P.
    Int. J. Grid High Perform. Comput., 3: 46-59
  • [45] SpFlow: Memory-driven Data Flow Optimization for Sparse Matrix-Matrix Multiplication
    Nie, Qi
    Malik, Sharad
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019,
  • [46] Accelerating Sparse General Matrix-Matrix Multiplication for NVIDIA Volta GPU and Hygon DCU
    Tian, Zhuo
    Yang, Shuai
    Zhang, Changyou
    PROCEEDINGS OF THE 32ND INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, HPDC 2023, 2023, : 329 - 330
  • [47] Bandwidth Optimized Parallel Algorithms for Sparse Matrix-Matrix Multiplication using Propagation Blocking
    Gu, Zhixiang
    Moreira, Jose
    Edelsohn, David
    Azad, Ariful
    PROCEEDINGS OF THE 32ND ACM SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES (SPAA '20), 2020, : 293 - 303
  • [48] Performance-Aware Model for Sparse Matrix-Matrix Multiplication on the Sunway TaihuLight Supercomputer
    Chen, Yuedan
    Li, Kenli
    Yang, Wangdong
    Xiao, Guoqing
    Xie, Xianghui
    Li, Tao
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2019, 30 (04) : 923 - 938
  • [49] Parallel sparse matrix-matrix multiplication: a scalable solution with 1D algorithm
    Hoque, Mohammad Asadul
    Raju, Md Rezaul Karim
    Tymczak, Christopher John
    Vrinceanu, Daniel
    Chilakamarri, Kiran
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2015, 11 (04) : 391 - 401
  • [50] Learning from Optimizing Matrix-Matrix Multiplication
    Parikh, Devangi N.
    Huang, Jianyu
    Myers, Margaret E.
    van de Geijn, Robert A.
    2018 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2018), 2018, : 332 - 339