Exploring the Design Space of Distributed Parallel Sparse Matrix-Multiple Vector Multiplication

Cited by: 0
Authors
Huang, Hua [1 ]
Chow, Edmond [1 ]
Affiliations
[1] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
Keywords
Sparse matrices; Partitioning algorithms; Vectors; Costs; Three-dimensional displays; Space exploration; Optimization; SpMM; SpMV; distributed-memory matrix multiplication; communication optimization; OPTIMIZATION; PERFORMANCE; FRAMEWORK;
DOI
10.1109/TPDS.2024.3452478
CLC classification
TP301 [Theory and Methods]
Subject classification code
081202
Abstract
We consider the distributed memory parallel multiplication of a sparse matrix by a dense matrix (SpMM). The dense matrix is often a collection of dense vectors. Standard implementations will multiply the sparse matrix by multiple dense vectors at the same time, to exploit the computational efficiencies therein. But such approaches generally utilize the same sparse matrix partitioning as if multiplying by a single vector. This article explores the design space of parallelizing SpMM and shows that a coarser grain partitioning of the matrix combined with a column-wise partitioning of the block of vectors can often require less communication volume and achieve higher SpMM performance. An algorithm is presented that chooses a process grid geometry for a given number of processes to optimize the performance of parallel SpMM. The algorithm can augment existing graph partitioners by utilizing the additional concurrency available when multiplying by multiple dense vectors to further reduce communication.
Pages: 1977-1988
Page count: 12
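To make the partitioning trade-off described in the abstract concrete, the sketch below models the communication volume of distributed SpMM on a pr x pc process grid: block rows of the sparse matrix are assigned to the pr row groups, and the k dense vectors are split column-wise across the pc column groups. This is a minimal, illustrative model under simplifying assumptions (square matrix, contiguous row blocks, rows of the dense block distributed conformally with the matrix rows); it is not the algorithm presented in the paper, and all names and parameters here (modeled_volumes, ghost_rows, the 5-point Laplacian test matrix, p = 16, k = 8) are hypothetical choices for the example.

import numpy as np
import scipy.sparse as sp

def contiguous_row_partition(n, pr):
    """Split n rows into pr nearly equal contiguous blocks [r0, r1)."""
    bounds = np.linspace(0, n, pr + 1, dtype=int)
    return list(zip(bounds[:-1], bounds[1:]))

def ghost_rows(A_csr, row_part):
    """For each block row of A, count the distinct nonzero column indices that
    lie outside the block's own row range.  Under a conformal row distribution
    of the dense block X, these are the rows of X that must be received."""
    ghosts = []
    for r0, r1 in row_part:
        cols = np.unique(A_csr[r0:r1, :].indices)   # distinct nonzero columns
        ghosts.append(int(np.count_nonzero((cols < r0) | (cols >= r1))))
    return ghosts

def modeled_volumes(A_csr, p, k):
    """Enumerate pr x pc process grids with pr * pc == p and pc <= k.
    Modeled receive volume for X: each of the pc processes sharing block row i
    fetches ghost_i rows of X, but only k / pc columns wide, so the total is
    k * sum(ghost_i) -- which shrinks as the row partition gets coarser."""
    n = A_csr.shape[0]
    results = []
    for pr in range(1, p + 1):
        if p % pr != 0 or (p // pr) > k:
            continue
        pc = p // pr
        ghosts = ghost_rows(A_csr, contiguous_row_partition(n, pr))
        results.append((pr, pc, k * sum(ghosts)))
    return results

if __name__ == "__main__":
    # Hypothetical test matrix: 2-D 5-point Laplacian on a 64 x 64 grid.
    m = 64
    T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(m, m))
    A = (sp.kron(sp.eye(m), T) + sp.kron(T, sp.eye(m))).tocsr()
    p, k = 16, 8                                    # processes, dense vectors
    for pr, pc, vol in modeled_volumes(A, p, k):
        print(f"grid {pr:2d} x {pc:2d}: modeled X receive volume = {vol}")

Coarser row partitions (smaller pr) shrink the modeled receive volume for the dense block but require each block row of A to be shared by more processes, which this simple count ignores; the algorithm in the paper chooses the process grid geometry to optimize actual SpMM performance rather than such a simplified volume model.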
Related papers
50 entries in total
  • [41] Optimization of Block Sparse Matrix-Vector Multiplication on Shared-Memory Parallel Architectures
    Eberhardt, Ryan
    Hoemmen, Mark
    2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016, : 663 - 672
  • [42] A two-dimensional data distribution method for parallel sparse matrix-vector multiplication
    Vastenhouw, B.
    Bisseling, R. H.
    SIAM REVIEW, 2005, 47 (01) : 67 - 95
  • [43] A Novel Multi-GPU Parallel Optimization Model for The Sparse Matrix-Vector Multiplication
    Gao, Jiaquan
    Zhou, Yuanshen
    Wu, Kesong
    PARALLEL PROCESSING LETTERS, 2016, 26 (04)
  • [44] A Novel Parallel Scan for Multicore Processors and Its Application in Sparse Matrix-Vector Multiplication
    Zhang, Nan
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2012, 23 (03) : 397 - 404
  • [45] Vectorized Parallel Sparse Matrix-Vector Multiplication in PETSc Using AVX-512
    Zhang, Hong
    Mills, Richard T.
    Rupp, Karl
    Smith, Barry F.
    PROCEEDINGS OF THE 47TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2018,
  • [46] Exploring Better Speculation and Data Locality in Sparse Matrix-Vector Multiplication on Intel Xeon
    Zhao, Haoran
    Xia, Tian
    Li, Chenyang
    Zhao, Wenzhe
    Zheng, Nanning
    Ren, Pengju
    2020 IEEE 38TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2020), 2020, : 601 - 609
  • [47] Data Distributions for Sparse-Matrix Vector Multiplication
    Romero, L. F.
    Zapata, E. L.
    PARALLEL COMPUTING, 1995, 21 (04) : 583 - 605
  • [48] Sparse-Matrix Vector Multiplication on Distributed Architectures: Lower Bounds and Average Complexity Results
    Manzini, G.
    INFORMATION PROCESSING LETTERS, 1994, 50 (05) : 231 - 238
  • [49] Understanding the performance of sparse matrix-vector multiplication
    Goumas, Georgios
    Kourtis, Kornilios
    Anastopoulos, Nikos
    Karakasis, Vasileios
    Koziris, Nectarios
    PROCEEDINGS OF THE 16TH EUROMICRO CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, 2008, : 283 - +
  • [50] Sparse Matrix-Vector Multiplication on a Reconfigurable Supercomputer
    DuBois, David
    DuBois, Andrew
    Connor, Carolyn
    Poole, Steve
    PROCEEDINGS OF THE SIXTEENTH IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, 2008, : 239 - +