Multi-FPGA Accelerator for Scalable Stencil Computation with Constant Memory Bandwidth

Cited by: 75
Authors
Sano, Kentaro [1]
Hatsuda, Yoshiaki [2]
Yamamoto, Satoru [1]
Affiliations
[1] Tohoku Univ, Grad Sch Informat Sci, Sendai, Miyagi 980, Japan
[2] Kobo Co Ltd, Kawaguchi, Saitama, Japan
Keywords
Scalable streaming-array; stencil computation; custom computing machine; FPGA; high-performance computation; MODEL;
DOI
10.1109/TPDS.2013.51
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Stencil computation is one of the important kernels in scientific computations. However, sustained performance is limited owing to restriction on memory bandwidth, especially on multicore microprocessors and graphics processing units (GPUs) because of their small operational intensity. In this paper, we present a custom computing machine (CCM), called a scalable streaming-array (SSA), for high-performance stencil computations with multiple field-programmable gate arrays (FPGAs). We design SSA based on a domain-specific programmable concept, where CCMs are programmable with the minimum functionality required for an algorithm domain. We employ a deep pipelining approach over successive iterations to achieve linear scalability for multiple devices with a constant memory bandwidth. Prototype implementation using nine FPGAs demonstrates good agreement with a performance model, and achieves 260 and 236 GFlop/s for 2D and 3D Jacobi computation, which are 87.4 and 83.9 percent of the peak, respectively, with a memory bandwidth of only 2.0 GB/s. We also evaluate the performance of SSA for state-of-the-art FPGAs.
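The abstract's central claim, that pipelining successive iterations lets the work grow with the number of devices while external memory traffic stays constant, can be illustrated with a small sketch. This is not the authors' SSA design: the function names, the 4-point Jacobi kernel, the count of 4 floating-point operations per cell, and the idealized traffic model (the grid is read once and written once regardless of pipeline depth) are all assumptions made for illustration. The last helper only rearranges the figures quoted in the abstract to recover the implied peak performance.

```python
def jacobi_step(grid):
    """One 2D Jacobi sweep: each interior cell becomes the mean of its 4 neighbours."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # boundary cells are copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def pipelined_sweeps(grid, stages):
    """Chain `stages` sweeps back to back, as a pipeline over iterations would.

    Assumed (idealized) cost model: intermediate grids never leave the
    pipeline, so external traffic is one read plus one write of the grid.
    """
    cells = len(grid) * len(grid[0])
    interior = (len(grid) - 2) * (len(grid[0]) - 2)
    for _ in range(stages):
        grid = jacobi_step(grid)
    flops = 4 * interior * stages  # assumed: 3 adds + 1 multiply per cell per sweep
    words_moved = 2 * cells        # independent of `stages`
    return grid, flops, words_moved


def implied_peak(sustained_gflops, fraction_of_peak):
    """Peak performance implied by a sustained rate and its fraction of peak."""
    return sustained_gflops / fraction_of_peak


if __name__ == "__main__":
    start = [[1.0] * 6 for _ in range(6)]
    for depth in (1, 3, 9):
        _, flops, words = pipelined_sweeps(start, depth)
        print(f"stages={depth}: flops={flops}, words moved={words}")
    print(f"implied 2D peak: {implied_peak(260, 0.874):.1f} GFlop/s")
    print(f"implied 3D peak: {implied_peak(236, 0.839):.1f} GFlop/s")
```

In this toy model, deepening the pipeline multiplies the arithmetic per pass while the words moved stay fixed, which is the qualitative behaviour the abstract describes. A real implementation would also have to account for on-chip buffering and inter-device link bandwidth, neither of which is modeled here.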
Pages: 695-705
Number of pages: 11
Related papers
38 items in total
  • [11] Scalable multi-FPGA platform for networks-on-chip emulation
    Kouadri-Mostefaoui, Abdellah-Medjadji
    Senouci, Benaoumeur
    Petrot, Frederic
    2007 IEEE INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES, AND PROCESSORS, 2007, : 54 - 60
  • [12] Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth
    Sano, Kentaro
    Hatsuda, Yoshiaki
    Yamamoto, Satoru
    2011 IEEE 19TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2011, : 234 - 241
  • [13] SMAPPIC: Scalable Multi-FPGA Architecture Prototype Platform in the Cloud
    Chirkov, Grigory
    Wentzlaff, David
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, VOL 2, ASPLOS 2023, 2023, : 733 - 746
  • [14] A Scalable Multi-FPGA Platform for Hybrid Intelligent Optimization Algorithms
    Zhao, Yu
    Zhao, Chun
    Zhao, Liangtian
    ELECTRONICS, 2024, 13 (17)
  • [15] Enhancing the Scalability of Multi-FPGA Stencil Computations via Highly Optimized HDL Components
    Reggiani, Enrico
    Del Sozzo, Emanuele
    Conficconi, Davide
    Natale, Giuseppe
    Moroni, Carlo
    Santambrogio, Marco D.
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2021, 14 (03)
  • [16] Scalable Multi-FPGA Acceleration for Large RNNs with Full Parallelism Levels
    Kwon, Dongup
    Hur, Suyeon
    Jang, Hamin
    Nurvitadhi, Eriko
    Kim, Jangwoo
    PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020
  • [17] NARMADA: Near-memory horizontal diffusion accelerator for scalable stencil computations
    Singh, Gagandeep
    Diamantopoulos, Dionysios
    Hagleitner, Christoph
    Stuijk, Sander
    Corporaal, Henk
    2019 29TH INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2019, : 263 - 269
  • [18] NERO: A Near High-Bandwidth Memory Stencil Accelerator for Weather Prediction Modeling
    Singh, Gagandeep
    Diamantopoulos, Dionysios
    Hagleitner, Christoph
    Gomez-Luna, Juan
    Stuijk, Sander
    Mutlu, Onur
    Corporaal, Henk
    2020 30TH INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2020, : 9 - 17
  • [19] High-Level Synthesis Design for Stencil Computations on FPGA with High Bandwidth Memory
    Du, Changdao
    Yamaguchi, Yoshiki
    ELECTRONICS, 2020, 9 (08) : 1 - 19
  • [20] Performance modeling and optimization of 3-D stencil computation on a stream-based FPGA accelerator
    Dohi, Keisuke
    Fukumoto, Kota
    Shibata, Yuichiro
    Oguri, Kiyoshi
    2013 INTERNATIONAL CONFERENCE ON RECONFIGURABLE COMPUTING AND FPGAS (RECONFIG), 2013