Shuhai: Benchmarking High Bandwidth Memory on FPGAs

Cited by: 45
Authors
Wang, Zeke [1]
Huang, Hongjing [1]
Zhang, Jie [1]
Alonso, Gustavo [2]
Affiliations
[1] Zhejiang University, Collaborative Innovation Center of Artificial Intelligence, Hangzhou, People's Republic of China
[2] Swiss Federal Institute of Technology (ETH Zurich), Systems Group, Zurich, Switzerland
Funding
National Natural Science Foundation of China
DOI
10.1109/FCCM48280.2020.00024
Chinese Library Classification number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
FPGAs are starting to be enhanced with High Bandwidth Memory (HBM) as a way to reduce the memory bandwidth bottleneck encountered in some applications and to give the FPGA more capacity to deal with application state. However, the performance characteristics of HBM are still not well specified, especially in the context of FPGAs. In this paper, we bridge the gap between nominal specifications and actual performance by benchmarking HBM on a state-of-the-art FPGA, i.e., a Xilinx Alveo U280 featuring a two-stack HBM subsystem. To this end, we propose Shuhai, a benchmarking tool that allows us to demystify all the underlying details of HBM on an FPGA. FPGA-based benchmarking should also provide a more accurate picture of HBM than doing so on CPUs/GPUs, since CPUs/GPUs are noisier systems due to their complex control logic and cache hierarchy. Since the memory itself is complex, leveraging custom hardware logic to benchmark inside an FPGA provides more details as well as accurate and deterministic measurements. We observe that 1) HBM is able to provide up to 425 GB/s memory bandwidth, and 2) how HBM is used has a significant impact on performance, which in turn demonstrates the importance of unveiling the performance characteristics of HBM so as to select the best approach. Shuhai can be easily generalized to other FPGA boards or other generations of memory, e.g., HBM3 and DDR3. We will make Shuhai open-source, benefiting the community.
Pages: 111-119
Page count: 9