Optimized Implementation of the HPCG Benchmark on Reconfigurable Hardware

Cited by: 5
Authors
Zeni, Alberto [1, 2]
O'Brien, Kenneth [1]
Blott, Michaela [1]
Santambrogio, Marco D. [2]
Affiliations
[1] Xilinx Inc, Res Labs, Dublin, Ireland
[2] Politecn Milan, Milan, Italy
Keywords
Reconfigurable architectures; High performance computing; Benchmark testing; High-performance
DOI
10.1007/978-3-030-85665-6_38
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The HPCG benchmark is a modern complement to the HPL benchmark for evaluating the performance of HPC systems, and it is recognized as more representative of real-world applications. While typical workloads grow ever more challenging, the semiconductor industry is struggling with performance scaling and power efficiency on next-generation technology nodes. As a result, the industry is turning towards more customized compute architectures to meet the latest performance requirements. In this paper, we present the details of the first FPGA-based implementation of HPCG that takes advantage of such customized compute architectures. Our results show that our high-performance multi-FPGA implementation, using 1 and 4 Xilinx Alveo U280 cards, achieves up to 108.3 GFlops and 346.5 GFlops, respectively, representing speed-ups of 104.1x and 333.2x over software running on a server with an Intel Xeon processor, with no loss of accuracy. We also demonstrate that the FPGA-based solution achieves performance comparable to modern GPUs and up to a 2.7x improvement in power efficiency compared to an NVIDIA Tesla V100. Finally, a theoretical evaluation based on Berkeley's Roofline model demonstrates that our implementation is near-optimally tuned on the Xilinx Alveo U280.
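The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it derives the CPU baseline implied by the reported speed-ups and the scaling efficiency from 1 to 4 cards, and it states the standard Roofline bound in its generic form. The function names are hypothetical, and no peak-compute or memory-bandwidth values for the Alveo U280 are assumed, since the abstract does not give them.

```python
def implied_baseline_gflops(fpga_gflops: float, speedup: float) -> float:
    """CPU throughput implied by an FPGA result and its reported speed-up."""
    return fpga_gflops / speedup


def scaling_efficiency(multi_gflops: float, single_gflops: float, n_cards: int) -> float:
    """Fraction of ideal linear scaling achieved when going from 1 to n_cards."""
    return multi_gflops / (n_cards * single_gflops)


def roofline_gflops(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Generic Roofline bound: min(peak compute, bandwidth * arithmetic intensity).

    intensity is in flops per byte; all inputs must be supplied by the caller.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


# Values taken from the abstract.
baseline_1 = implied_baseline_gflops(108.3, 104.1)   # about 1.04 GFlops
baseline_4 = implied_baseline_gflops(346.5, 333.2)   # about 1.04 GFlops
efficiency = scaling_efficiency(346.5, 108.3, 4)     # about 0.80

print(f"implied CPU baseline: {baseline_1:.2f} / {baseline_4:.2f} GFlops")
print(f"4-card scaling efficiency: {efficiency:.0%}")
```

Both speed-up figures imply the same software baseline of roughly 1.04 GFlops, and the 4-card result corresponds to about 80% of ideal linear scaling over the single-card result.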
Pages: 616-630
Number of pages: 15