Cerberus: Triple Mode Acceleration of Sparse Matrix and Vector Multiplication

Cited by: 1

Authors:

Hwang, Soojin [1]

Baek, Daehyeon [1]

Park, Jongse [1]

Huh, Jaehyuk [1]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Sch Comp, 291 Daehak Ro, Daejeon 34141, South Korea

Source:

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION | 2024, Vol. 21, No. 02

Keywords:

Sparse Matrix-Vector Multiplication (SpMV); accelerator

DOI:

10.1145/3653020

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Sparse matrix-vector multiplication (SpMV) is one of the most widely used kernels in high-performance computing and in machine learning acceleration for sparse neural networks. The design space of SpMV accelerators has two axes: algorithm and matrix representation. Two widely used algorithms, scalar multiplication and dot product, can be combined with two sparse data representations, compressed sparse and bitmap formats, for the matrix and vector. Although prior accelerators each adopted one of the possible designs, it has not yet been investigated which design performs best across different hardware resources and workload characteristics. This paper first investigates the impact of design choices with respect to the algorithm and data representation. The evaluation shows that no single design always outperforms the others across different workloads, but the two best designs (i.e., compressed sparse format and bitmap format with dot product) have complementary performance, with trade-offs driven by the matrix characteristics. Based on this analysis, the study proposes Cerberus, a triple-mode accelerator supporting two sparse operation modes in addition to the base dense mode. To enable such multi-mode operation, it proposes a prediction model based on matrix characteristics under a given hardware configuration, which statically selects the best mode for a given sparse matrix from its dimension and density information. Experimental results show that Cerberus provides a 12.1x performance improvement over a dense-only accelerator and a 1.5x improvement over a fixed best SpMV design.
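
As an illustration of the design space the abstract describes, the following minimal Python sketch pairs the two algorithms with the two representations in software. It is not the Cerberus hardware design. All function names and data layouts here are hypothetical, and reading "scalar multiplication" as a column-wise scale-and-accumulate scheme is an assumption, not something the record states.

```python
# Software-only sketch; names and layouts are illustrative assumptions.

def dense_to_csr(a):
    """Compressed sparse rows: nonzero values, their column indices, row offsets."""
    values, col_idx, row_ptr = [], [], [0]
    for row in a:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def dense_to_csc(a):
    """Compressed sparse columns: nonzero values, their row indices, column offsets."""
    values, row_idx, col_ptr = [], [], [0]
    n_cols = len(a[0]) if a else 0
    for j in range(n_cols):
        for i, row in enumerate(a):
            if row[j] != 0:
                values.append(row[j])
                row_idx.append(i)
        col_ptr.append(len(values))
    return values, row_idx, col_ptr

def dense_to_bitmap(a):
    """Bitmap format: one integer bit mask per row plus the packed nonzero values."""
    bitmaps, values = [], []
    for row in a:
        bits = 0
        for j, v in enumerate(row):
            if v != 0:
                bits |= 1 << j
                values.append(v)
        bitmaps.append(bits)
    return bitmaps, values

def spmv_csr_dot(values, col_idx, row_ptr, x):
    """Dot product: each output element is one matrix row dotted with x."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

def spmv_bitmap_dot(bitmaps, values, x):
    """Dot product over the bitmap format: scan each row mask for set bits."""
    y, k = [], 0
    for bits in bitmaps:
        acc, j = 0, 0
        while bits:
            if bits & 1:
                acc += values[k] * x[j]
                k += 1
            bits >>= 1
            j += 1
        y.append(acc)
    return y

def spmv_csc_scalar(values, row_idx, col_ptr, x, n_rows):
    """Scalar multiplication (assumed meaning): scale column j by x[j], accumulate."""
    y = [0] * n_rows
    for j in range(len(col_ptr) - 1):
        if x[j] == 0:
            continue  # a zero vector element contributes nothing
        for k in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[k]] += values[k] * x[j]
    return y
```

All three routines compute the same product. They differ in how nonzeros are located: index arrays in the compressed formats, bit masks in the bitmap format. The abstract attributes the performance trade-offs to matrix characteristics such as dimension and density.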


Pages: 24

Total: 50 entries

[1] Acceleration of Sparse Matrix-Vector Multiplication by Region Traversal
Simecek, I.
ACTA POLYTECHNICA, 2008, 48 (04) : 8 - 15
[2] Sparse Vector-Matrix Multiplication Acceleration in Diode-Selected Crossbars
Jao, Nicholas
Ramanathan, Akshay Krishna
Sampson, John
Narayanan, Vijaykrishnan
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2021, 29 (12) : 2186 - 2196
[3] SpDRAM: Efficient In-DRAM Acceleration of Sparse Matrix-Vector Multiplication
Kang, Jieui
Choi, Soeun
Lee, Eunjin
Sim, Jaehyeong
IEEE ACCESS, 2024, 12 : 176009 - 176021
[4] GPU accelerated sparse matrix-vector multiplication and sparse matrix-transpose vector multiplication
Tao, Yuan
Deng, Yangdong
Mu, Shuai
Zhang, Zhenzhong
Zhu, Mingfa
Xiao, Limin
Ruan, Li
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (14) : 3771 - 3789
[5] Sparse Matrix to Matrix Multiplication: A Representation and Architecture for Acceleration
Golnari, Pareesa Ameneh
Malik, Sharad
2019 IEEE 30TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP 2019), 2019, : 67 - 70
[6] Sparse Matrix Sparse Vector Multiplication - A Novel Approach
Shah, Monika
2015 44TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS, 2015, : 67 - 73
[7] FPGA ACCELERATION OF SPARSE MATRIX-VECTOR MULTIPLICATION BASED ON NETWORK-ON-CHIP
Jheng, H. Y.
Sun, C. C.
Ruan, S. J.
Goetze, J.
19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011), 2011, : 744 - 748
[8] Sparse Matrix-Vector Multiplication on GPGPUs
Filippone, Salvatore
Cardellini, Valeria
Barbieri, Davide
Fanfarillo, Alessandro
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2017, 43 (04):
[9] Parallel Computation of Sparse Matrix Vector Multiplication
Yin, Wei
He, Yu
2011 INTERNATIONAL CONFERENCE ON FUTURE COMPUTER SCIENCE AND APPLICATION (FCSA 2011), VOL 3, 2011, : 196 - 199
[10] Sparse matrix by vector multiplication on transputer networks
Doreste, L.
Navarro, J.J.
Fernandez, A.
Proceedings of the IASTED International Symposium on Applied Informatics, 1991,
