SIMD Parallel Execution on GPU from High-Level Dataflow Synthesis

Cited by: 1
Authors
Bloch, Aurelien [1]
Brunet, Simone Casale [1]
Mattavelli, Marco [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, SCI, MM, STI, Lausanne, Switzerland
Keywords
dynamic dataflow programs; RVC-CAL; SIMD parallel computing; source-to-source compiler; GPU programming; heterogeneous systems;
DOI
10.1109/MCSoC51149.2021.00017
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Writing and optimizing application software for heterogeneous platforms that include GPUs is a difficult task: obtaining good performance requires the designer to weigh several key elements, at a significant cost in effort and resources. Dataflow programming has proven to be a good approach to this task because dataflow programs are portable and a dataflow network can be partitioned arbitrarily across the processing units of a heterogeneous platform. However, this design methodology is not sufficient by itself to obtain good performance. This paper describes methodological steps for improving the performance of dataflow programs written in RVC-CAL and synthesized to execute on heterogeneous CPU/GPU co-processing platforms. The steps include optimizing the communication between processing elements, a strategy for efficiently scheduling independent GPU partitions, and the introduction of dynamic programming to leverage the SIMD nature of GPU platforms. The approach is validated qualitatively and quantitatively on example dataflow applications executed under several partitioning configurations.
Pages: 62-68
Number of pages: 7
Related papers
50 in total
  • [31] High-Level Dataflow Transformations Using Taylor Expansion Diagrams
    Ciesielski, Maciej
    Gomez-Prado, Daniel
    Guillot, Jeremie
    Boutillon, Emmanuel
    IEEE DESIGN & TEST OF COMPUTERS, 2009, 26 (04): : 46 - 57
  • [32] GPU-Accelerated High-Level Synthesis for Bitwidth Optimization of FPGA Datapaths
    Kapre, Nachiket
    Ye, Deheng
    PROCEEDINGS OF THE 2016 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS (FPGA'16), 2016, : 185 - 194
  • [33] Performance Estimation of High-Level Dataflow Program on Heterogeneous Platforms
    Bloch, Aurelien
    Brunet, Simone Casale
    Mattavelli, Marco
    2021 IEEE 14TH INTERNATIONAL SYMPOSIUM ON EMBEDDED MULTICORE/MANY-CORE SYSTEMS-ON-CHIP (MCSOC 2021), 2021, : 69 - 76
  • [34] A Streaming Dataflow Engine for Sparse Matrix-Vector Multiplication Using High-Level Synthesis
    Hosseinabady, Mohammad
    Nunez-Yanez, Jose Luis
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (06) : 1272 - 1285
  • [35] Extracting high-level activities from low-level program execution logs
    Stepanov, Evgenii V.
    Mitsyuk, Alexey A.
    AUTOMATED SOFTWARE ENGINEERING, 2024, 31 (02)
  • [36] A High-Level Synthesis Library for Synthesizing Efficient and Functional-Safe CNN Dataflow Accelerators
    Filippas, Dionysios
    Peltekis, Christodoulos
    Titopoulos, Vasileios
    Kansizoglou, Ioannis
    Sirakoulis, Georgios CH.
    Gasteratos, Antonios
    Dimitrakopoulos, Giorgos
    IEEE ACCESS, 2024, 12 : 57194 - 57208
  • [37] Automatic Generation of Optimized and Synthesizable Hardware Implementation from High-Level Dataflow Programs
    Jerbi, Khaled
    Raulet, Mickael
    Deforges, Olivier
    Abid, Mohamed
    VLSI DESIGN, 2012, Hindawi Limited (2012)
  • [38] HIGH-LEVEL SYNTHESIS
    PAWLAK, A
    MICROPROCESSING AND MICROPROGRAMMING, 1992, 35 (1-5): : 261 - 261
  • [39] FROM BEHAVIOR TO STRUCTURE - HIGH-LEVEL SYNTHESIS
    CAMPOSANO, R
    IEEE DESIGN & TEST OF COMPUTERS, 1990, 7 (05): : 8 - 19
  • [40] Parallel high-level replacement systems
    Taentzer, G
    THEORETICAL COMPUTER SCIENCE, 1997, 186 (1-2) : 43 - 81