Enhancing MPI plus OpenMP Task Based Applications for Heterogeneous Architectures with GPU Support

Cited by: 1
Authors
Ferat, Manuel [1]
Pereira, Romain [2, 4, 5]
Roussel, Adrien [3, 4]
Carribault, Patrick [3, 4]
Steffenel, Luiz-Angelo [1]
Gautier, Thierry [5]
Affiliations
[1] Univ Reims, LRC DIGIT, LICIIS, F-51097 Reims, France
[2] CEA, DAM, DIF, F-91297 Arpajon, France
[3] CEA, DAM, DIF, LRC DIGIT, F-91297 Arpajon, France
[4] Univ Paris Saclay, CEA, Lab Informat Haute Performance Calcul & Simulat, F-91680 Bruyeres Le Chatel, France
[5] ENS Lyon, LIP, Project Team AVALON INRIA, Lyon, France
Keywords
OpenMP; GPU Computing; Distributed Application; Task programming
DOI
10.1007/978-3-031-15922-0_1
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Heterogeneous supercomputers are now widespread in HPC, and programming efficient applications for these architectures is a challenge. Task-based programming models are a promising way to tackle this challenge. Since OpenMP 4.0 and 4.5, the target directives make it possible to offload pieces of code to GPUs and to express them as tasks with dependencies. Heterogeneous machines can therefore be programmed using MPI+OpenMP(task+target) to expose a very high level of concurrent asynchronous operations, in which data transfers, kernel executions, communications, and CPU computations can be overlapped. Hence, it is possible to suspend the tasks performing these asynchronous operations on the CPUs and to overlap their completion with the execution of other tasks. Suspended tasks can resume, opportunistically at every scheduling point, once the associated asynchronous event has completed. We have integrated this feature into the MPC framework, validated it on an AXPY microbenchmark, and evaluated it on an MPI+OpenMP(tasks) implementation of the LULESH proxy application. The results show that we are able to improve asynchronism and overall HPC performance, allowing applications to benefit from asynchronous execution on heterogeneous machines.
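To make the programming model named in the abstract concrete, the following is a minimal sketch of an AXPY kernel (y = a*x + y) written with OpenMP target tasks and dependencies. It is not the authors' benchmark code and shows neither the MPI side nor the task suspension mechanism inside MPC; the function name `axpy_tasks` and the `chunk` parameter are hypothetical, chosen only for illustration. It uses only standard OpenMP 4.5 clauses (`target`, `nowait`, `map`, `depend`).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch only: computes y = a*x + y in chunks.
 * Each chunk is submitted as a deferred target task (nowait) whose
 * depend clauses name the array sections it reads and writes, so a
 * runtime is free to overlap transfers and kernels of different chunks.
 * axpy_tasks and chunk are hypothetical names, not from the paper. */
void axpy_tasks(size_t n, double a, const double *x, double *y, size_t chunk)
{
    assert(x != NULL && y != NULL && chunk > 0);

    #pragma omp parallel
    #pragma omp single
    {
        for (size_t start = 0; start < n; start += chunk) {
            /* The last chunk may be shorter than the others. */
            size_t len = (n - start < chunk) ? (n - start) : chunk;

            /* nowait turns the offload into an asynchronous target task;
             * map moves only this chunk to and from the device. */
            #pragma omp target teams distribute parallel for nowait \
                    map(to: x[start:len]) map(tofrom: y[start:len]) \
                    depend(in: x[start:len]) depend(inout: y[start:len])
            for (size_t i = start; i < start + len; ++i)
                y[i] = a * x[i] + y[i];
        }
        /* Wait for every chunk before leaving the single region. */
        #pragma omp taskwait
    }
}
```

When compiled without OpenMP support the pragmas are ignored and the loop runs serially on the host; with OpenMP enabled but no device available, the target regions fall back to host execution. In both cases the numerical result is the same, which makes the sketch easy to check before trying it on a GPU.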
Pages: 3-16
Number of pages: 14