Open-Source SpMV Multiplication Hardware Accelerator for FPGA-Based HPC Systems

Authors
Mpakos, Panagiotis [1]
Tasou, Ioanna [1]
Alverti, Chloe [3]
Miliadis, Panagiotis [1]
Malakonakis, Pavlos [2]
Theodoropoulos, Dimitris [1]
Goumas, Georgios [1]
Pnevmatikatos, Dionisios N. [1]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, Comp Syst Lab, Athens, Greece
[2] Tech Univ Crete, Khania, Greece
[3] Univ Illinois, Champaign, IL USA
Funding
European Union Horizon 2020
Keywords
Open-Source; SpMV; Sparse Matrix; HLS;
DOI
10.1007/978-3-031-55673-9_2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Sparse Matrix-Vector (SpMV) multiplication kernel is a key component of many high-performance computing applications, but it is also one of the most challenging to optimize, primarily due to its low flop-per-byte ratio and irregular memory accesses. As such, modern FPGAs combined with High-Bandwidth Memory (HBM) modules are much better suited to the memory-bound nature of this kernel than general-purpose CPUs. Current FPGA-based approaches to SpMV support only single-precision floating-point arithmetic. Moreover, they target highly streamed implementations that, although they enhance performance, rely on custom matrix storage formats, which (i) can increase the matrix footprint by up to 3x, and (ii) shift the burden of input matrix transformation onto developers. Towards widening the spectrum of FPGA-supported floating-point formats for sparse algebra, this paper presents a first set of effective optimizations for double-precision SpMV hardware kernels using High-Level Synthesis (HLS) tools on HBM-equipped FPGAs. Results show that our work provides, on average, 52.4x better performance than state-of-practice double-precision SpMV implementations on FPGAs for applications with volatile matrices, and up to 5.1x better performance-per-Watt than server-class CPUs.
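To make the "low flop-per-byte ratio and irregular memory accesses" remark concrete, the following is a minimal software sketch of double-precision SpMV. It assumes the common CSR (compressed sparse row) layout, which the abstract does not name, and it is not the paper's HLS kernel. The function names, the 4-byte index width, and the "every array is touched exactly once" traffic model are illustrative assumptions.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed layout)."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Non-zeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is indexed through col_idx, so the access pattern is irregular.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def flops_per_byte(nnz, n_rows, n_cols, value_bytes=8, index_bytes=4):
    """Rough arithmetic intensity, assuming each array is moved exactly once."""
    flops = 2 * nnz  # one multiply and one add per non-zero
    matrix_bytes = nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes
    vector_bytes = (n_cols + n_rows) * value_bytes  # read x, write y
    return flops / (matrix_bytes + vector_bytes)


# Example 3x3 matrix: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
intensity = flops_per_byte(nnz=5, n_rows=3, n_cols=3)
```

Under these assumptions each non-zero contributes two floating-point operations but requires at least 12 bytes of matrix data in double precision, so the ratio stays well below one flop per byte regardless of matrix size, which is why memory bandwidth rather than compute tends to bound the kernel.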
Pages: 19-32
Page count: 14