SIMD Parallel Execution on GPU from High-Level Dataflow Synthesis

Cited by: 1

Authors:

Bloch, Aurelien [1]

Brunet, Simone Casale [1]

Mattavelli, Marco [1]

Affiliations:

[1] Ecole Polytech Fed Lausanne, SCI, MM, STI, Lausanne, Switzerland

Source:

2021 IEEE 14TH INTERNATIONAL SYMPOSIUM ON EMBEDDED MULTICORE/MANY-CORE SYSTEMS-ON-CHIP (MCSOC 2021) | 2021

Keywords:

dynamic dataflow programs; RVC-CAL; SIMD parallel computing; source-to-source compiler; GPU programming; heterogeneous systems

DOI:

10.1109/MCSoC51149.2021.00017

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

Writing and optimizing application software for heterogeneous platforms that include GPUs is a difficult task: obtaining good performance requires designer effort and resources to address several key elements. Dataflow programming has been shown to be a good approach to this task because of its portability and because a dataflow network can be arbitrarily partitioned across the units of a heterogeneous platform. However, such a design methodology is not sufficient by itself to obtain good performance. The paper describes methodological steps for improving the performance of dataflow programs written in RVC-CAL and synthesized to execute on heterogeneous CPU/GPU co-processing platforms. The steps include optimizing the performance of the communication tasks between processing elements, a strategy for the efficient scheduling of independent GPU partitions, and the introduction of dynamic programming for leveraging the SIMD nature of GPU platforms. The approach is validated qualitatively and quantitatively using example dataflow application programs executed under several partitioning configurations.

Pages: 62-68

Page count: 7
