Multithreaded sparse matrix-matrix multiplication for many-core and GPU architectures

Cited by: 34

Authors:

Deveci, Mehmet [1]

Trott, Christian [1]

Rajamanickam, Sivasankaran [1]

Affiliations:

[1] Sandia Natl Labs, POB 5800, Albuquerque, NM 87185 USA

Source:

PARALLEL COMPUTING | 2018 / Volume 78

Keywords:

Sparse matrix sparse matrix multiplication; KNLs; GPUs; SpGEMM;

DOI:

10.1016/j.parco.2018.06.009

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Sparse matrix-matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high-performance computing architectures. The performance of these algorithms depends on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, KKSPGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved. (C) 2018 Elsevier B.V. All rights reserved.
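To illustrate the two ideas the abstract names (accumulators and a two-phase structure), the following is a minimal sequential sketch of a row-wise (Gustavson-style) sparse product C = A * B on CSR arrays. It is not the KKSPGEMM implementation described in the paper: the function names `spgemm_symbolic` and `spgemm_numeric` are hypothetical, the accumulator is an ordinary Python dict standing in for a hash-map accumulator, and no parallelism is shown.

```python
def spgemm_symbolic(a_ptr, a_idx, b_ptr, b_idx):
    """Phase 1 (symbolic): compute row pointers of C from structure only."""
    n_rows = len(a_ptr) - 1
    c_ptr = [0] * (n_rows + 1)
    for i in range(n_rows):
        cols = set()  # distinct column indices reached by row i of A
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            cols.update(b_idx[b_ptr[k]:b_ptr[k + 1]])
        c_ptr[i + 1] = c_ptr[i] + len(cols)
    return c_ptr


def spgemm_numeric(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, c_ptr):
    """Phase 2 (numeric): fill column indices and values of C."""
    n_rows = len(c_ptr) - 1
    c_idx = [0] * c_ptr[-1]
    c_val = [0.0] * c_ptr[-1]
    for i in range(n_rows):
        acc = {}  # hash-map accumulator: column index -> partial sum
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_idx[p], a_val[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + a_ik * b_val[q]
        pos = c_ptr[i]
        for j in sorted(acc):  # write the row in ascending column order
            c_idx[pos], c_val[pos] = j, acc[j]
            pos += 1
    return c_idx, c_val


# A = [[1, 2], [0, 3]] and B = [[4, 0], [5, 6]] in CSR form
a_ptr, a_idx, a_val = [0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0]
b_ptr, b_idx, b_val = [0, 1, 3], [0, 0, 1], [4.0, 5.0, 6.0]

c_ptr = spgemm_symbolic(a_ptr, a_idx, b_ptr, b_idx)
c_idx, c_val = spgemm_numeric(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, c_ptr)
```

Because the symbolic phase depends only on the sparsity patterns, `c_ptr` can be computed once and reused whenever the values of A or B change but their structure does not, which is the kind of reuse the abstract argues for.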

Pages: 33-46

Page count: 14

50 entries in total

[1] Exploiting Locality in Sparse Matrix-Matrix Multiplication on Many-Core Architectures
Akbudak, Kadir
Aykanat, Cevdet
[J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, 28 (08) : 2258 - 2271
[2] Performance-Portable Sparse Matrix-Matrix Multiplication for Many-Core Architectures
Deveci, Mehmet
Trott, Christian
Rajamanickam, Sivasankaran
[J]. 2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017, : 693 - 702
[3] MEMORY-EFFICIENT SPARSE MATRIX-MATRIX MULTIPLICATION BY ROW MERGING ON MANY-CORE ARCHITECTURES
Gremse, Felix
Kuepper, Kerstin
Naumann, Uwe
[J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2018, 40 (04) : C429 - C449
[4] Optimizing Sparse Matrix-Matrix Multiplication for the GPU
Dalton, Steven
Olson, Luke
Bell, Nathan
[J]. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2015, 41 (04):
[5] Adaptive Sparse Matrix-Matrix Multiplication on the GPU
Winter, Martin
Mlakar, Daniel
Zayer, Rhaleb
Seidel, Hans-Peter
Steinberger, Markus
[J]. PROCEEDINGS OF THE 24TH SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING (PPOPP '19), 2019, : 68 - 81
[6] Sparse Matrix-Matrix Multiplication on Modern Architectures
Matam, Kiran
Indarapu, Siva Rama Krishna Bharadwaj
Kothapalli, Kishore
[J]. 2012 19TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2012,
[7] Sparse Matrix Multiplication on a Reconfigurable Many-Core Architecture
Pinhao, Joao
Jose, Wilson
Neto, Horacio
Vestias, Mario
[J]. 2015 EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD), 2015, : 330 - 336
[8] Adaptive Optimization of Sparse Matrix-Vector Multiplication on Emerging Many-Core Architectures
Chen, Shizhao
Fang, Jianbin
Chen, Donglin
Xu, Chuanfu
Wang, Zheng
[J]. IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 649 - 658
[9] Scale-Free Sparse Matrix-Vector Multiplication on Many-Core Architectures
Liang, Yun
Tang, Wai Teng
Zhao, Ruizhe
Lu, Mian
Huynh Phung Huynh
Goh, Rick Siow Mong
[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2017, 36 (12) : 2106 - 2119
[10] Accelerating sparse matrix-matrix multiplication with GPU Tensor Cores
Zachariadis, Orestis
Satpute, Nitin
Gomez-Luna, Juan
Olivares, Joaquin
[J]. COMPUTERS & ELECTRICAL ENGINEERING, 2020, 88 (88)
