Hypergraph partitioning for sparse matrix-matrix multiplication

Cited by: 2

Authors:

Ballard G. [1]

Druinsky A. [2]

Knight N. [3]

Schwartz O. [4]

Affiliations:

[1] Department of Computer Science, Wake Forest University, PO Box 7311, Winston-Salem, 27109, NC

[2] Computational Research Division, Lawrence Berkeley National Laboratory, MS 50F-1650, 1 Cyclotron Rd., Berkeley, 94720, CA

[3] Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, 10012, NY

[4] Benin School of Computer Science and Engineering, Hebrew University of Jerusalem, Jerusalem

Source:

ACM Transactions on Parallel Computing | 2016 / Vol. 3 / No. 3

Funding:

Israel Science Foundation;

Keywords:

Hypergraph partitioning; Sparse matrix-matrix multiplication;

DOI:

10.1145/3015144

Abstract:

We propose a fine-grained hypergraph model for sparse matrix-matrix multiplication (SpGEMM), a key computational kernel in scientific computing and data analysis whose performance is often communication bound. This model correctly describes both the interprocessor communication volume along a critical path in a parallel computation and also the volume of data moving through the memory hierarchy in a sequential computation. We show that identifying a communication-optimal algorithm for particular input matrices is equivalent to solving a hypergraph partitioning problem. Our approach is nonzero structure dependent, meaning that we seek the best algorithm for the given input matrices. In addition to our three-dimensional fine-grained model, we also propose coarse-grained one-dimensional and two-dimensional models that correspond to simpler SpGEMM algorithms. We explore the relations between our models theoretically, and we study their performance experimentally in the context of three applications that use SpGEMM as a key computation. For each application, we find that at least one coarse-grained model is as communication efficient as the fine-grained model. We also observe that different applications have affinities for different algorithms. Our results demonstrate that hypergraphs are an accurate model for reasoning about the communication costs of SpGEMM as well as a practical tool for exploring the SpGEMM algorithm design space. © 2016 ACM.
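The idea of a fine-grained model can be illustrated with a small sketch. It is a simplified illustration written under stated assumptions, not the paper's exact construction: here each vertex is one scalar multiplication a_ik * b_kj, each net is one nonzero of A, B, or C, and the cost of a partition is the standard connectivity-minus-one metric, taken as a stand-in for communication volume. The function names `build_fine_grained_hypergraph` and `connectivity_cost` are hypothetical, and the full model in the paper may include additional vertices, weights, and balance constraints that are omitted here.

```python
from collections import defaultdict


def build_fine_grained_hypergraph(a_nonzeros, b_nonzeros):
    """Build a hypergraph over the scalar multiplications of C = A * B.

    a_nonzeros: set of (i, k) positions where A is nonzero.
    b_nonzeros: set of (k, j) positions where B is nonzero.
    Returns (vertices, nets). A vertex is a triple (i, k, j), one per
    multiplication a_ik * b_kj. A net is keyed by a labelled nonzero such
    as ("A", i, k) and holds every multiplication that touches it.
    (Assumption: simplified model for illustration only.)
    """
    b_rows = defaultdict(list)
    for k, j in b_nonzeros:
        b_rows[k].append(j)

    vertices = set()
    nets = defaultdict(set)
    for i, k in a_nonzeros:
        for j in b_rows.get(k, []):
            v = (i, k, j)
            vertices.add(v)
            nets[("A", i, k)].add(v)
            nets[("B", k, j)].add(v)
            nets[("C", i, j)].add(v)
    return vertices, dict(nets)


def connectivity_cost(nets, part_of):
    """Sum over nets of (number of parts the net spans) - 1."""
    return sum(len({part_of[v] for v in net}) - 1 for net in nets.values())


# Tiny example: A = [[x, x], [0, x]], B = [[x, 0], [x, x]].
a_nz = {(0, 0), (0, 1), (1, 1)}
b_nz = {(0, 0), (1, 0), (1, 1)}
vertices, nets = build_fine_grained_hypergraph(a_nz, b_nz)

# Three coarse assignments of multiplications to two parts:
# by row index i, by inner index k, and by column index j.
row_wise = {v: v[0] for v in vertices}
outer_product = {v: v[1] for v in vertices}
column_wise = {v: v[2] for v in vertices}

cost_row = connectivity_cost(nets, row_wise)
cost_outer = connectivity_cost(nets, outer_product)
cost_col = connectivity_cost(nets, column_wise)
print(cost_row, cost_outer, cost_col)  # prints: 2 1 2
```

In this toy instance, grouping the multiplications by the inner index cuts fewer nets than grouping by row or by column, which mirrors the abstract's point that the best choice depends on the nonzero structure of the inputs. A real study would hand the hypergraph to a partitioner rather than enumerate assignments by hand.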

Pages: 1 / 34

Number of pages: 33

