H-SIMD machine: Configurable parallel computing for matrix multiplication

Cited by: 0
Authors
Xu, XZ [1]
Ziavras, SG [1]
Affiliations
[1] New Jersey Inst Technol, Dept Elect & Comp Engn, Newark, NJ 07102 USA
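Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract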
FPGAs (Field-Programmable Gate Arrays) are often used as coprocessors to boost the performance of data-intensive applications [1, 2]. However, mapping algorithms onto multimillion-gate FPGAs is time-consuming and remains a challenge in configurable system design. The communication overhead between the host workstation and the FPGAs is also significant. To address these problems, we propose in this paper the FPGA-based Hierarchical-SIMD (H-SIMD) machine, codesigned with the Hierarchical Instruction Set Architecture (HISA). At each level, HISA instructions are classified as either communication instructions or computation instructions. The former are executed by the local controller, while the latter are issued to the lower level for execution. Additionally, by using a memory switching scheme and the high-level HISA set to partition the application into coarse-grain tasks, the host-FPGA communication overhead can be hidden. We use matrix multiplication (MM) to test the effectiveness of H-SIMD. The test results show sustained high performance.
Pages: 671-676
Page count: 6
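
The split between communication and computation instructions described in the abstract can be illustrated with a small sketch. This is a hypothetical, single-level software model and not the paper's HISA: the `Instr` format, the `Controller` class, the operation names, and the block size are all assumptions made for illustration. It assumes square matrices whose size is a multiple of the block size, and it does not model the memory switching scheme or any FPGA hardware.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Hypothetical instruction format (not taken from the paper)."""
    kind: str    # "comm": handled by the local controller; "comp": issued to the lower level
    op: str      # illustrative operation name
    args: tuple  # (row offset, column offset, inner offset) of the block


def matmul(a, b):
    """Stand-in for the lower level: plain triple-loop multiply of two blocks."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def block(mat, r, c, size):
    """Copy the size x size block of mat whose top-left corner is (r, c)."""
    return [row[c:c + size] for row in mat[r:r + size]]


class Controller:
    """Hypothetical local controller: runs comm itself, delegates comp."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.buffers = {}
        self.log = []  # records the kind of every instruction seen

    def execute(self, instr, a, b, out):
        self.log.append(instr.kind)
        i, j, p = instr.args
        if instr.kind == "comm":
            # Communication: move the operand blocks into local buffers.
            s = self.block_size
            self.buffers["a"] = block(a, i, p, s)
            self.buffers["b"] = block(b, p, j, s)
        else:
            # Computation: the lower level multiplies the buffered blocks,
            # and the partial product is accumulated into the output block.
            prod = matmul(self.buffers["a"], self.buffers["b"])
            for r, row in enumerate(prod):
                for c, v in enumerate(row):
                    out[i + r][j + c] += v


def program(n, s):
    """Coarse-grain task list: one load and one multiply-accumulate per block triple."""
    for i in range(0, n, s):
        for j in range(0, n, s):
            for p in range(0, n, s):
                yield Instr("comm", "load_blocks", (i, j, p))
                yield Instr("comp", "mul_acc", (i, j, p))


def h_matmul(a, b, s):
    """Multiply square matrices a and b using s x s blocks (len(a) % s == 0 assumed)."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    ctl = Controller(s)
    for instr in program(n, s):
        ctl.execute(instr, a, b, out)
    return out, ctl.log
```

Calling `h_matmul(a, b, 2)` on two 4 x 4 matrices returns the product together with a log of instruction kinds, which makes it easy to see that each block triple contributes one communication step and one computation step.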