FPGA architecture and implementation of sparse matrix-vector multiplication for the finite element method

被引:17
|
作者
Elkurdi, Yousef [1 ]
Fernandez, David [1 ]
Souleimanov, Evgueni [1 ]
Giannacopoulos, Dennis [1 ]
Gross, Warren J. [1 ]
机构
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ H3A 2A7, Canada
关键词
FPGA; SMVM; FEM;
D O I
10.1016/j.cpc.2007.11.014
中图分类号
TP39 [计算机的应用];
学科分类号
081203 ; 0835 ;
摘要
The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Progmmmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory. (c) 2007 Elsevier B.V. All rights reserved.
引用
收藏
页码:558 / 570
页数:13
相关论文
共 50 条
  • [41] Load-balancing in sparse matrix-vector multiplication
    Nastea, SG
    Frieder, O
    ElGhazawi, T
    EIGHTH IEEE SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 1996, : 218 - 225
  • [42] Energy Evaluation of Sparse Matrix-Vector Multiplication on GPU
    Benatia, Akrem
    Ji, Weixing
    Wang, Yizhuo
    Shi, Feng
    2016 SEVENTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC), 2016,
  • [43] Sparse Matrix-Vector Multiplication Based on Online Arithmetic
    Cherati, Sahar Moradi
    Jaberipur, Ghassem
    Sousa, Leonel
    IEEE ACCESS, 2024, 12 : 87653 - 87664
  • [44] Implementing Sparse Matrix-Vector Multiplication with QCSR on GPU
    Zhang, Jilin
    Liu, Enyi
    Wan, Jian
    Ren, Yongjian
    Yue, Miao
    Wang, Jue
    APPLIED MATHEMATICS & INFORMATION SCIENCES, 2013, 7 (02): : 473 - 482
  • [45] Sparse matrix-vector multiplication on network-on-chip
    Sun, C-C
    Goetze, J.
    Jheng, H-Y
    Ruan, S-J
    ADVANCES IN RADIO SCIENCE, 2010, 8 : 289 - 294
  • [46] Communication balancing in parallel sparse matrix-vector multiplication
    Bisseling, RH
    Meesen, W
    ELECTRONIC TRANSACTIONS ON NUMERICAL ANALYSIS, 2005, 21 : 47 - 65
  • [47] Optimization by Runtime Specialization for Sparse Matrix-Vector Multiplication
    Kamin, Sam
    Garzaran, Maria Jesus
    Aktemur, Baris
    Xu, Danqing
    Yilmaz, Buse
    Chen, Zhongbo
    ACM SIGPLAN NOTICES, 2015, 50 (03) : 93 - 102
  • [48] A new approach for accelerating the sparse matrix-vector multiplication
    Tvrdik, Pavel
    Simecek, Ivan
    SYNASC 2006: EIGHTH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, PROCEEDINGS, 2007, : 156 - +
  • [49] Parallel Sparse Matrix-Vector Multiplication Using Accelerators
    Maeda, Hiroshi
    Takahashi, Daisuke
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2016, PT II, 2016, 9787 : 3 - 18
  • [50] Adaptive diagonal sparse matrix-vector multiplication on GPU
    Gao, Jiaquan
    Xia, Yifei
    Yin, Renjie
    He, Guixia
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2021, 157 : 287 - 302