Optimized Implementation of the HPCG Benchmark on Reconfigurable Hardware

Cited by: 5
Authors
Zeni, Alberto [1, 2]
O'Brien, Kenneth [1 ]
Blott, Michaela [1 ]
Santambrogio, Marco D. [2 ]
Affiliations
[1] Xilinx Inc, Res Labs, Dublin, Ireland
[2] Politecn Milan, Milan, Italy
Source
Keywords
Reconfigurable architectures; High performance computing; Benchmark testing; High-performance
DOI
10.1007/978-3-030-85665-6_38
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
The HPCG benchmark is a modern complement to the HPL benchmark for evaluating the performance of HPC systems, as it has been recognized as more representative of real-world applications. While typical workloads become increasingly challenging, the semiconductor industry is struggling with performance scaling and power efficiency on next-generation technology nodes. As a result, the industry is turning towards more customized compute architectures to help meet the latest performance requirements. In this paper, we present the details of the first FPGA-based implementation of HPCG that takes advantage of such customized compute architectures. Our results show that our high-performance multi-FPGA implementation, using one and four Xilinx Alveo U280 cards, achieves up to 108.3 GFlops and 346.5 GFlops respectively, representing speed-ups of 104.1x and 333.2x over software running on a server with an Intel Xeon processor, with no loss of accuracy. We also demonstrate that the FPGA-based solution achieves performance comparable to modern GPUs and up to a 2.7x improvement in power efficiency compared to an NVIDIA Tesla V100. Finally, a theoretical evaluation based on Berkeley's Roofline model demonstrates that our implementation is near-optimally tuned on the Xilinx Alveo U280.
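
As a rough reading aid, the figures quoted in the abstract can be cross-checked with a few lines of Python. The sketch below derives the CPU baseline implied by each reported speed-up and the scaling efficiency from one to four cards, and it states the generic Roofline bound (attainable performance is the minimum of the compute roof and the memory roof). The function name `roofline` and the numbers passed to it are illustrative placeholders, not the Alveo U280's specifications and not values taken from the paper.

```python
def roofline(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Generic Roofline bound in GFLOP/s: min(compute roof, memory roof)."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


# Figures reported in the abstract.
gflops_1, gflops_4 = 108.3, 346.5      # 1 and 4 Alveo U280 cards
speedup_1, speedup_4 = 104.1, 333.2    # versus the Intel Xeon software run

# CPU baseline implied by each (throughput, speed-up) pair, in GFLOP/s.
cpu_baseline_1 = gflops_1 / speedup_1
cpu_baseline_4 = gflops_4 / speedup_4

# Fraction of ideal linear scaling retained when going from 1 to 4 cards.
scaling_efficiency = gflops_4 / (4 * gflops_1)

print(f"implied CPU baseline: {cpu_baseline_1:.2f} and {cpu_baseline_4:.2f} GFLOP/s")
print(f"4-card scaling efficiency: {scaling_efficiency:.1%}")

# Placeholder inputs only (hypothetical device, NOT the U280's real figures):
# a low arithmetic intensity puts the kernel under the memory roof.
print(roofline(peak_gflops=1000.0, bandwidth_gbs=400.0, intensity_flop_per_byte=0.25))
```

Both speed-up pairs imply nearly the same CPU baseline, which suggests the two results were measured against a single reference run; the abstract itself does not state the baseline figure.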
Pages: 616-630
Page count: 15