Enhancing MPI plus OpenMP Task Based Applications for Heterogeneous Architectures with GPU Support

Cited by: 1
Authors
Ferat, Manuel [1]
Pereira, Romain [2, 4, 5]
Roussel, Adrien [3, 4]
Carribault, Patrick [3, 4]
Steffenel, Luiz-Angelo [1]
Gautier, Thierry [5]
Affiliations
[1] Univ Reims, LRC DIGIT, LICIIS, F-51097 Reims, France
[2] CEA, DAM, DIF, F-91297 Arpajon, France
[3] CEA, DAM, DIF, LRC DIGIT, F-91297 Arpajon, France
[4] Univ Paris Saclay, CEA, Lab Informat Haute Performance Calcul & Simulat, F-91680 Bruyeres Le Chatel, France
[5] ENS Lyon, LIP, Project Team AVALON INRIA, Lyon, France
Keywords
OpenMP; GPU Computing; Distributed Application; Task programming;
DOI
10.1007/978-3-031-15922-0_1
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Heterogeneous supercomputers are widespread among HPC systems, and programming efficient applications on these architectures is a challenge. Task-based programming models are a promising way to tackle this challenge. Since OpenMP 4.0 and 4.5, the target directives make it possible to offload pieces of code to GPUs and to express them as tasks with dependencies. Heterogeneous machines can therefore be programmed using MPI+OpenMP(task+target) to expose a very high level of concurrent asynchronous operations, in which data transfers, kernel executions, communications and CPU computations can be overlapped. Hence, it is possible to suspend the tasks that perform these asynchronous operations on the CPUs and to overlap their completion with the execution of another task. A suspended task resumes once its associated asynchronous event has completed; completion is checked opportunistically at every scheduling point. We have integrated this feature into the MPC framework, validated it on an AXPY microbenchmark, and evaluated it on an MPI+OpenMP(tasks) implementation of the LULESH proxy application. The results show that we are able to improve asynchronism and overall HPC performance, allowing applications to benefit from asynchronous execution on heterogeneous machines.
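As a rough illustration of the OpenMP(task+target) style the abstract refers to, the following sketch expresses an AXPY kernel as an asynchronous target task with dependencies and runs it alongside an independent CPU task. It is not the authors' code and does not show the MPC runtime's suspend/resume mechanism or any MPI communication; the function name `axpy_overlap_demo` and the test data are hypothetical, chosen only for this example. Built without OpenMP support, the pragmas are ignored and the code runs sequentially on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the paper): computes y = a*x + y as an
 * asynchronous target task next to an independent CPU task, then returns
 * the largest absolute deviation from the expected results
 * (or -1.0 if memory allocation fails). */
double axpy_overlap_demo(size_t n)
{
    assert(n > 0);
    const double a = 2.0;
    double *x = malloc(n * sizeof *x);
    double *y = malloc(n * sizeof *y);
    double cpu_sum = 0.0;
    double max_err = 0.0;

    if (x == NULL || y == NULL) {
        free(x);
        free(y);
        return -1.0;
    }
    for (size_t i = 0; i < n; ++i) {
        x[i] = (double)i;
        y[i] = 1.0;
    }

    #pragma omp parallel
    #pragma omp single
    {
        /* Offloaded AXPY: "nowait" turns the target region into a
         * deferred task, "depend" orders it against other tasks that
         * touch x or y. */
        #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n]) \
            depend(in: x[0:n]) depend(inout: y[0:n]) nowait
        for (size_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];

        /* Independent CPU task with no dependence on x or y, so the
         * runtime is free to overlap it with the offloaded kernel. */
        #pragma omp task shared(cpu_sum)
        for (size_t i = 0; i < n; ++i)
            cpu_sum += 1.0;

        /* Scheduling point: wait for both child tasks to finish. */
        #pragma omp taskwait
    }

    /* Check y[i] == 2*i + 1 and cpu_sum == n. */
    for (size_t i = 0; i < n; ++i) {
        double d = y[i] - (a * (double)i + 1.0);
        if (d < 0.0) d = -d;
        if (d > max_err) max_err = d;
    }
    double ds = cpu_sum - (double)n;
    if (ds < 0.0) ds = -ds;
    if (ds > max_err) max_err = ds;

    free(x);
    free(y);
    return max_err;
}
```

Calling `axpy_overlap_demo(1000)` from a `main` function should return `0.0` when both tasks produced the expected values. Whether the kernel and the CPU task actually overlap depends on the compiler, the OpenMP runtime and the availability of a device.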
Pages: 3-16
Number of pages: 14
Related papers
50 in total
  • [21] A Heterogeneous MPI plus PPL Task Scheduling Approach for Asynchronous Many-Task Runtime Systems
    Holmen, John K.
    Sahasrabudhe, Damodar
    Berzins, Martin
    PRACTICE AND EXPERIENCE IN ADVANCED RESEARCH COMPUTING 2021, PEARC 2021, 2021
  • [22] Heterogeneous programming using OpenMP and CUDA/HIP for hybrid CPU-GPU scientific applications
    Tallada, Marc Gonzalez
    Morancho, Enric
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2023, 37 (05): 626-646
  • [23] Adaptive Stochastic Gradient Descent for Deep Learning on Heterogeneous CPU plus GPU Architectures
    Ma, Yujing
    Rusu, Florin
    Wu, Kesheng
    Sim, Alexander
    2021 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2021: 6-15
  • [24] TOPPER: An integrated environment for task allocation and execution of MPI applications onto parallel architectures
    Konstantinou, D
    Koziris, N
    ADVANCES IN INFORMATICS, 2003, 2563: 336-350
  • [25] GPU-Based Embedded Intelligence Architectures and Applications
    Ang, Li Minn
    Seng, Kah Phooi
    ELECTRONICS, 2021, 10 (08)
  • [26] Enhancing Intra-Node GPU-to-GPU Performance in MPI plus UCX through Multi-Path Communication
    Sojoodi, Amirhossein
    Temucin, Yiltan Hassan
    Afsahi, Ahmad
    PROCEEDINGS OF 2024 3RD INTERNATIONAL WORKSHOP ON EXTREME HETEROGENEITY SOLUTIONS, EXHET 2024, 2024: 9-14
  • [27] A Hybrid MPI plus OpenMP Solution of the Distributed Cluster-Based Fish Schooling Simulator
    Borges, Francisco
    Gutierrez-Milla, Albert
    Suppi, Remo
    Luque, Emilio
    2014 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2014, 29: 2111-2120
  • [28] Algorithms for Scheduling Task-based Applications onto Heterogeneous Many-core Architectures
    Kinsy, Michel A.
    Devadas, Srinivas
    2014 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2014
  • [29] sOMP: Simulating OpenMP Task-Based Applications with NUMA Effects
    Daoudi, Idriss
    Virouleau, Philippe
    Gautier, Thierry
    Thibault, Samuel
    Aumage, Olivier
    OPENMP: PORTABLE MULTI-LEVEL PARALLELISM ON MODERN SYSTEMS, 2020, 12295: 197-211
  • [30] Efficient MPI-based Communication for GPU-Accelerated Dask Applications
    Shafi, Aamir
    Hashmi, Jahanzeb Maqbool
    Subramoni, Hari
    Panda, Dhabaleswar K.
    21ST IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2021), 2021: 277-286