MOSCON: Modified Outer Product based Sparse Matrix-Matrix Multiplication Accelerator with Configurable Tiles

Cited by: 3

Authors:

Noble, G. ^{[1]}

Nalesh, S. ^{[2]}

Kala, S. ^{[1]}

Affiliations:

[1] Indian Inst Informat Technol Kottayam, Dept Elect & Commun Engn, Kottayam, Kerala, India

[2] Cochin Univ Sci & Technol, Dept Elect, Cochin, Kerala, India

Source:

2023 36TH INTERNATIONAL CONFERENCE ON VLSI DESIGN AND 2023 22ND INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS, VLSID | 2023

Keywords:

Deep learning; Sparse matrix multiplication; Execution time; FPGA accelerator

DOI:

10.1109/VLSID57277.2023.00061

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

General Sparse Matrix-Matrix Multiplication (SpGEMM), the product of two sparse matrices, is a key operation in many deep learning algorithms. Sparse matrices contain only a few non-zero elements, which makes conventional matrix multiplication algorithms inefficient. Hence, specialized architectures for sparse matrix multiplication have been proposed. Prior works in this field use outer product based implementations and suffer from poor load balance across the processing elements. We propose a modified outer product based sparse matrix-matrix multiplication architecture with configurable tiles, referred to as MOSCON, which can be accelerated on Field Programmable Gate Arrays (FPGAs). MOSCON can multiply sparse matrices of any dimensions and combines the advantages of the outer product implementation with the features of a load-balanced architecture. The proposed architecture has been implemented on a Xilinx Kintex-7 FPGA device and gives an average performance gain of 9.21% compared with state-of-the-art implementations.
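
For readers unfamiliar with the outer product formulation mentioned in the abstract, the following is a minimal software sketch of generic outer-product SpGEMM. It is not the MOSCON architecture or its tiling scheme; the function name `spgemm_outer`, the dictionary-based storage, and the `work` counter are assumptions made purely for illustration. The sketch multiplies column k of A by row k of B for each k and accumulates the partial products, and it records how many multiplications each k generates, which hints at why load balance can be uneven.

```python
from collections import defaultdict


def spgemm_outer(a_cols, b_rows):
    """Generic outer-product SpGEMM sketch (illustrative, not MOSCON).

    a_cols[k] maps row index i -> A[i][k] for the non-zeros of column k of A.
    b_rows[k] maps column index j -> B[k][j] for the non-zeros of row k of B.
    Returns (C, work): C maps (i, j) -> value, work maps k -> multiply count.
    """
    c = defaultdict(float)
    work = {}
    # Only indices k present in both operands contribute partial products.
    for k in sorted(a_cols.keys() & b_rows.keys()):
        col, row = a_cols[k], b_rows[k]
        work[k] = len(col) * len(row)  # size of this outer product
        for i, a in col.items():
            for j, b in row.items():
                c[(i, j)] += a * b  # merge partial products into C
    return dict(c), work


# Hypothetical 3x3 example:
# A = [[1, 0, 2], [0, 0, 3], [4, 0, 0]], B = [[0, 5, 0], [6, 0, 0], [7, 0, 8]]
a_cols = {0: {0: 1.0, 2: 4.0}, 2: {0: 2.0, 1: 3.0}}
b_rows = {0: {1: 5.0}, 1: {0: 6.0}, 2: {0: 7.0, 2: 8.0}}
c, work = spgemm_outer(a_cols, b_rows)
```

In this small example the two contributing outer products generate different numbers of multiplications, so assigning one outer product per processing element would leave the work unevenly distributed.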

Pages: 264-269

Number of pages: 6
