Towards a Universal FPGA Matrix-Vector Multiplication Architecture

Cited by: 44

Authors:

Kestur, Srinidhi [1]

Davis, John D. [2]

Chung, Eric S. [2]

Affiliations:

[1] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA

[2] Microsoft Res Silicon Valley, Mountain View, CA 94043 USA

Source:

2012 IEEE 20TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM) | 2012

Keywords:

FPGA; dense matrix; sparse matrix; spMV; reconfigurable computing

DOI:

10.1109/FCCM.2012.12

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

We present the design and implementation of a universal, single-bitstream library for accelerating matrix-vector multiplication using FPGAs. Our library handles multiple matrix encodings ranging from dense to multiple sparse formats. A key novelty in our approach is the introduction of a hardware-optimized sparse matrix representation called Compressed Variable-Length Bit Vector (CVBV), which reduces the storage and bandwidth requirements by up to 43% (25% on average) compared to compressed sparse row (CSR) across all the matrices from the University of Florida Sparse Matrix Collection. Our hardware incorporates a runtime-programmable decoder that performs on-the-fly decoding of various formats such as Dense, COO, CSR, DIA, and ELL. The flexibility and scalability of our design are demonstrated across two FPGA platforms: (1) the BEE3 (Virtex-5 LX155T with 16GB of DRAM) and (2) the ML605 (Virtex-6 LX240T with 2GB of DRAM). For dense matrices, our approach scales to large data sets with over 1 billion elements and achieves robust performance independent of the matrix aspect ratio. For sparse matrices, our approach using a compressed representation reduces the overall bandwidth while also achieving comparable efficiency relative to state-of-the-art approaches.
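
For readers unfamiliar with the sparse encodings named in the abstract, the following is a minimal software sketch of matrix-vector multiplication over two of them, COO and CSR. It is an illustration only and not the paper's hardware decoder or its CVBV format; the function names, variable names, and the 3x3 example matrix are assumptions introduced here.

```python
# Illustrative sketch only: plain-Python SpMV for the COO and CSR formats.
# Function names and the example matrix are hypothetical, not from the paper.

def spmv_coo(rows, cols, vals, x, n_rows):
    """y = A @ x where A is stored as (row, col, value) triples (COO)."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def spmv_csr(row_ptr, col_idx, vals, x):
    """y = A @ x where A is stored in compressed sparse row (CSR) form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i, so the
    per-nonzero row index of COO is replaced by one pointer per row.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y


# Example matrix (assumed for illustration):
# [[4, 0, 1],
#  [0, 0, 2],
#  [3, 5, 0]]
coo_rows = [0, 0, 1, 2, 2]
coo_cols = [0, 2, 2, 0, 1]
values = [4.0, 1.0, 2.0, 3.0, 5.0]

csr_row_ptr = [0, 2, 3, 5]   # row 0 -> entries 0..1, row 1 -> 2, row 2 -> 3..4
csr_col_idx = coo_cols       # same column order because entries are row-sorted

x = [1.0, 2.0, 3.0]
y_coo = spmv_coo(coo_rows, coo_cols, values, x, n_rows=3)
y_csr = spmv_csr(csr_row_ptr, csr_col_idx, values, x)
print(y_coo, y_csr)
```

Both encodings hold the same nonzero values; they differ only in how the row position of each value is recorded, which is the kind of metadata cost that a compressed representation aims to reduce.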


Pages: 9-16

Page count: 8

