Open-Source SpMV Multiplication Hardware Accelerator for FPGA-Based HPC Systems

Cited by: 0
Authors
Mpakos, Panagiotis [1]
Tasou, Ioanna [1]
Alverti, Chloe [3]
Miliadis, Panagiotis [1]
Malakonakis, Pavlos [2]
Theodoropoulos, Dimitris [1]
Goumas, Georgios [1]
Pnevmatikatos, Dionisios N. [1]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, Comp Syst Lab, Athens, Greece
[2] Tech Univ Crete, Khania, Greece
[3] Univ Illinois, Champaign, IL USA
Funding
EU Horizon 2020
Keywords
Open-Source; SpMV; Sparse Matrix; HLS;
DOI
10.1007/978-3-031-55673-9_2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Sparse Matrix-Vector (SpMV) multiplication kernel is a key component of many high-performance computing applications, yet it is one of the most challenging to optimize, primarily because of its low flop-per-byte ratio and irregular memory accesses. Modern FPGAs combined with High-Bandwidth Memory (HBM) modules are therefore much better suited to the memory-bound nature of this kernel than general-purpose CPUs. Current FPGA-based approaches to SpMV support only single-precision floating-point arithmetic. Moreover, they target highly streamed implementations that, although they enhance performance, rely on custom matrix storage formats, which (i) can increase the matrix footprint by up to 3x and (ii) shift the burden of input matrix transformation onto developers. Towards widening the spectrum of FPGA-supported floating-point formats for sparse algebra, this paper presents a first set of effective optimizations for double-precision SpMV hardware kernels using High-Level Synthesis (HLS) tools on HBM-equipped FPGAs. Results show that our work provides on average 52.4x better performance than state-of-practice double-precision SpMV implementations on FPGAs for applications with volatile matrices, and up to 5.1x better performance-per-Watt than server-class CPUs.
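To make the abstract's "low flop-per-byte ratio and irregular memory accesses" concrete, the following is a minimal software sketch of double-precision SpMV over the standard CSR format. It is not the paper's HLS kernel or its storage format. The function names `spmv_csr` and `flops_per_byte_csr` are hypothetical, and the traffic model assumes 8-byte values, 4-byte indices, a square matrix, and that the input and output vectors each move through memory exactly once.

```python
from typing import List


def spmv_csr(row_ptr: List[int], col_idx: List[int],
             values: List[float], x: List[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Non-zeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is indexed through col_idx: this is the irregular access.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def flops_per_byte_csr(n_rows: int, nnz: int,
                       value_bytes: int = 8, index_bytes: int = 4) -> float:
    """Estimate the flop-per-byte ratio under the assumed traffic model."""
    flops = 2 * nnz  # one multiply and one add per non-zero
    matrix_bytes = nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes
    vector_bytes = 2 * n_rows * value_bytes  # read x once, write y once
    return flops / (matrix_bytes + vector_bytes)


# 3x3 example: [[4, 0, 1], [0, 3, 0], [2, 0, 5]] times [1, 2, 3]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]
print(spmv_csr(row_ptr, col_idx, values, x))  # [7.0, 6.0, 17.0]
print(flops_per_byte_csr(n_rows=3, nnz=5))
```

Under these assumptions each non-zero contributes 2 flops against at least 12 bytes of matrix data, so the ratio stays below 1/6 flop per byte regardless of matrix size, which is why the kernel is memory-bound.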
Pages: 19-32
Number of pages: 14