Performance Analysis of a Hardware Accelerator of Dependence Management for Task-based Dataflow Programming models

Cited by: 0
Authors
Tan, Xubin [1]
Bosch, Jaume [1]
Jimenez-Gonzalez, Daniel [1]
Alvarez-Martinez, Carlos [1]
Ayguade, Eduard [1]
Valero, Mateo [1]
Affiliations
[1] Univ Politecn Cataluna, Barcelona Supercomp Ctr, Barcelona, Spain
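Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract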
With the growing popularity of multicore and manycore processors, task-based dataflow programming models have attracted great attention for their ability to extract high parallelism from applications without exposing the complexity to programmers. One of the pioneers is OpenMP Superscalar (OmpSs). By implementing dynamic task dependence analysis, dataflow scheduling, and out-of-order execution in the runtime, OmpSs achieves high performance with coarse- and medium-granularity tasks. In theory, for a given application, the more parallel tasks that can be exposed, the higher the achievable speedup. In practice, this is limited by task granularity: beyond a certain point, the runtime overhead outweighs the performance gain and slows the application down. To overcome this limitation, Picos was proposed as a fast hardware accelerator for fine-grained task and dependence management in support of task-based dataflow programming models such as OmpSs, and a simulator was developed for design space exploration. This paper presents the first functional hardware prototype inspired by Picos. An embedded system based on a Zynq 7000 All-Programmable SoC is developed to study its capabilities and possible bottlenecks. Initial scalability and hardware resource consumption studies of different Picos designs are performed to find the one with the highest performance and lowest hardware cost. A thorough performance study is then carried out on both the prototype with the most balanced configuration and the OmpSs software-only alternative. Results show that, for fine-grained tasks, the OmpSs runtime hardware support significantly outperforms the software-only implementation currently available in the runtime system.
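The granularity trade-off described in the abstract can be illustrated with a toy model. The sketch below is not taken from the paper: the function name `modeled_speedup`, the cost formula, and all parameter values are assumptions chosen only to show how a fixed per-task runtime cost can eventually outweigh the gain from exposing more tasks.

```python
def modeled_speedup(n_tasks, total_work=1.0, per_task_overhead=0.001, workers=8):
    """Toy speedup model; every parameter is hypothetical, not a measurement."""
    # Assumption: the runtime creates tasks and analyses their dependences
    # serially, so this cost grows linearly with the number of tasks.
    runtime_cost = n_tasks * per_task_overhead
    # Assumption: useful work is split evenly and perfectly balanced,
    # with at most `workers` tasks executing at the same time.
    parallel_time = total_work / min(n_tasks, workers)
    # Speedup relative to running the whole work as one task with no runtime.
    return total_work / (runtime_cost + parallel_time)


if __name__ == "__main__":
    # Finer granularity (more tasks) helps at first, then hurts.
    for n in (1, 8, 64, 512, 4096):
        print(n, round(modeled_speedup(n), 2))
```

Under these assumptions, lowering `per_task_overhead` pushes the point of diminishing returns toward finer-grained tasks, which is the effect a hardware dependence manager aims for.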
Pages: 225-234
Page count: 10