Towards a Universal FPGA Matrix-Vector Multiplication Architecture

Cited by: 44
Authors
Kestur, Srinidhi [1]
Davis, John D. [2]
Chung, Eric S. [2]
Affiliations
[1] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[2] Microsoft Res Silicon Valley, Mountain View, CA 94043 USA
Keywords
FPGA; dense matrix; sparse matrix; spMV; reconfigurable computing
DOI
10.1109/FCCM.2012.12
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We present the design and implementation of a universal, single-bitstream library for accelerating matrix-vector multiplication using FPGAs. Our library handles multiple matrix encodings ranging from dense to multiple sparse formats. A key novelty in our approach is the introduction of a hardware-optimized sparse matrix representation called Compressed Variable-Length Bit Vector (CVBV), which reduces the storage and bandwidth requirements by up to 43% (on average 25%) compared to compressed sparse row (CSR) across all the matrices from the University of Florida Sparse Matrix Collection. Our hardware incorporates a runtime-programmable decoder that performs on-the-fly decoding of various formats such as Dense, COO, CSR, DIA, and ELL. The flexibility and scalability of our design are demonstrated across two FPGA platforms: (1) the BEE3 (Virtex-5 LX155T with 16GB of DRAM) and (2) the ML605 (Virtex-6 LX240T with 2GB of DRAM). For dense matrices, our approach scales to large data sets with over 1 billion elements, and achieves robust performance independent of the matrix aspect ratio. For sparse matrices, our approach using a compressed representation reduces the overall bandwidth while also achieving comparable efficiency relative to state-of-the-art approaches.
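For readers unfamiliar with the CSR baseline that the abstract compares against, the following is a minimal software sketch of CSR encoding and matrix-vector multiplication. It is not the paper's hardware design and does not implement CVBV; the function names `dense_to_csr` and `csr_matvec` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a plain-Python CSR (compressed sparse row)
# encoder and matrix-vector product. Function names are hypothetical;
# this is not the FPGA implementation described in the abstract.

def dense_to_csr(dense):
    """Encode a dense matrix (list of rows) as CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)      # nonzero value
                col_idx.append(j)     # its column index
        row_ptr.append(len(values))   # end offset of this row
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x where A is given in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        # Nonzeros of row i occupy values[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


if __name__ == "__main__":
    A = [[1, 0, 2],
         [0, 0, 3],
         [4, 5, 0]]
    vals, cols, ptrs = dense_to_csr(A)
    print(vals, cols, ptrs)
    print(csr_matvec(vals, cols, ptrs, [1, 2, 3]))
```

In CSR, every nonzero carries an explicit column index alongside its value, so index metadata is a significant share of the bytes streamed from memory. That overhead is the storage and bandwidth cost the abstract reports CVBV reducing.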
Pages: 9-16
Number of pages: 8