Multi-FPGA Accelerator for Scalable Stencil Computation with Constant Memory Bandwidth

Cited by: 75
Authors
Sano, Kentaro [1]
Hatsuda, Yoshiaki [2]
Yamamoto, Satoru [1]
Affiliations
[1] Tohoku Univ, Grad Sch Informat Sci, Sendai, Miyagi 980, Japan
[2] Kobo Co Ltd, Kawaguchi, Saitama, Japan
Keywords
Scalable streaming-array; stencil computation; custom computing machine; FPGA; high-performance computation; MODEL;
DOI
10.1109/TPDS.2013.51
Chinese Library Classification
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Stencil computation is one of the important kernels in scientific computations. However, sustained performance is limited owing to restriction on memory bandwidth, especially on multicore microprocessors and graphics processing units (GPUs) because of their small operational intensity. In this paper, we present a custom computing machine (CCM), called a scalable streaming-array (SSA), for high-performance stencil computations with multiple field-programmable gate arrays (FPGAs). We design SSA based on a domain-specific programmable concept, where CCMs are programmable with the minimum functionality required for an algorithm domain. We employ a deep pipelining approach over successive iterations to achieve linear scalability for multiple devices with a constant memory bandwidth. Prototype implementation using nine FPGAs demonstrates good agreement with a performance model, and achieves 260 and 236 GFlop/s for 2D and 3D Jacobi computation, which are 87.4 and 83.9 percent of the peak, respectively, with a memory bandwidth of only 2.0 GB/s. We also evaluate the performance of SSA for state-of-the-art FPGAs.
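The idea of pipelining over successive iterations can be illustrated with a small software model. The sketch below is not the authors' hardware design: the function names, the 4-neighbor averaging stencil, and the count of 4 floating-point operations per cell are assumptions made for illustration. It chains several Jacobi sweeps on a grid that is read from external memory once and written back once per pass, so external traffic stays constant while the work per pass grows with the number of chained stages.

```python
def jacobi_step(grid):
    """One 2D Jacobi sweep: each interior cell becomes the mean of its 4 neighbors."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary cells are held fixed
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def cascaded_sweeps(grid, stages):
    """Toy model of `stages` chained units, each applying one iteration.

    Assumption: the grid is read from external memory once and written back
    once per pass; intermediate results stay inside the chain.
    Returns the final grid and the number of external words moved.
    """
    words_read = len(grid) * len(grid[0])
    for _ in range(stages):
        grid = jacobi_step(grid)
    words_written = len(grid) * len(grid[0])
    return grid, words_read + words_written


def flops_per_word(grid, stages, flops_per_cell=4):
    """Operations per external word for one pass (hypothetical metric).

    flops_per_cell=4 assumes three additions and one multiplication per cell.
    """
    interior = (len(grid) - 2) * (len(grid[0]) - 2)
    _, traffic = cascaded_sweeps(grid, stages)
    return stages * interior * flops_per_cell / traffic
```

In this model, going from one stage to nine leaves the external traffic unchanged and multiplies the operations per word by nine, which is the sense in which performance can scale with the number of devices at a fixed memory bandwidth.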
Pages: 695-705
Page count: 11