A Heterogeneous MPI plus PPL Task Scheduling Approach for Asynchronous Many-Task Runtime Systems

Cited by: 1
Authors
Holmen, John K. [1]
Sahasrabudhe, Damodar [1]
Berzins, Martin [1]
Affiliations
[1] Univ Utah, SCI Inst, Salt Lake City, UT 84112 USA
Keywords
Asynchronous Many-Task Runtime System; Performance Portability; Parallelism and Concurrency; Portability; Software Engineering; Performance
DOI
10.1145/3437359.3465581
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Asynchronous many-task runtime systems and MPI+X hybrid parallelism approaches have shown promise for helping manage the increasing complexity of nodes in current and emerging high performance computing (HPC) systems, including those for exascale. The increasing architectural diversity of these systems, however, poses challenges for runtimes supporting more homogeneous HPC systems. Performance portability layers (PPL) have shown promise for helping manage this diversity. This paper describes a heterogeneous MPI+PPL task scheduling approach for combining these promising solutions with additional consideration for parallel third party libraries facing similar challenges to help prepare such a runtime for the diverse heterogeneous systems accompanying exascale computing. This approach is demonstrated using a heterogeneous MPI+Kokkos task scheduler and the accompanying portable abstractions [16] implemented in the Uintah Computational Framework, an asynchronous many-task runtime system, with additional consideration for hypre, a parallel third party library. Results are shown for two challenging problems executing workloads representative of typical Uintah applications. These results show performance improvements up to 4.4x when using this scheduler and the accompanying portable abstractions [16] to port a previously MPI-Only problem to Kokkos::OpenMP and Kokkos::CUDA to improve complex heterogeneous node use. Good strong-scaling to 1,024 NVIDIA V100 GPUs and 512 IBM POWER9 processors is also shown using MPI+Kokkos::OpenMP+Kokkos::CUDA at scale.
Pages: 9
Related papers
50 in total
  • [1] Understanding the Effect of Task Granularity on Execution Time in Asynchronous Many-Task Runtime Systems
    Shirzad, Shahrzad
    Tohid, R.
    Kheirkhahan, Alireza
    Wagle, Bibek
    Kaiser, Hartmut
    EURO-PAR 2021: PARALLEL PROCESSING WORKSHOPS, 2022, 13098 : 456 - 467
  • [2] Automatic Halo Management for the Uintah GPU-Heterogeneous Asynchronous Many-Task Runtime
    Peterson, Brad
    Humphrey, Alan
    Sunderland, Dan
    Sutherland, James
    Saad, Tony
    Dasari, Harish
    Berzins, Martin
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2019, 47 (5-6) : 1086 - 1116
  • [3] Automatic Halo Management for the Uintah GPU-Heterogeneous Asynchronous Many-Task Runtime
    Brad Peterson
    Alan Humphrey
    Dan Sunderland
    James Sutherland
    Tony Saad
    Harish Dasari
    Martin Berzins
    International Journal of Parallel Programming, 2019, 47 : 1086 - 1116
  • [4] Improving the Scaling of an Asynchronous Many-Task Runtime with a Lightweight Communication Engine
    Mor, Omri
    Bosilca, George
    Snir, Marc
    PROCEEDINGS OF THE 52ND INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2023, 2023, : 153 - 162
  • [5] In-Memory Runtime File Systems for Many-Task Computing
    Uta, Alexandru
    Sandu, Andreea
    Morozan, Ion
    Kielmann, Thilo
    ADAPTIVE RESOURCE MANAGEMENT AND SCHEDULING FOR CLOUD COMPUTING (ARMS-CC 2014), 2014, 8907 : 3 - 5
  • [6] Comparison Of Single Source Shortest Path Algorithms On Two Recent Asynchronous Many-task Runtime Systems
    Firoz, Jesun Sahariar
    Barnas, Martina
    Zalewski, Marcin
    Lumsdaine, Andrew
    2015 IEEE 21ST INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2015, : 674 - 681
  • [7] 3D Visualization of Asynchronous Many-Task Scheduling Algorithm
    Vasev, P. A.
    Scientific Visualization, 2023, 15 (04) : 92 - 111
  • [8] Integrating Inter-Node Communication with a Resilient Asynchronous Many-Task Runtime System
    Paul, Sri Raj
    Hayashi, Akihiro
    Whitlock, Matthew
    Bak, Seonmyeong
    Teranishi, Keita
    Mayo, Jackson
    Grossman, Max
    Sarkar, Vivek
    PROCEEDINGS OF THE EXASCALE MPI WORKSHOP (EXAMPI 2020), 2020, : 41 - 51
  • [9] Runtime Task Scheduling Using Imitation Learning for Heterogeneous Many-Core Systems
    Krishnakumar, Anish
    Arda, Samet E.
    Goksoy, A. Alper
    Mandal, Sumit K.
    Ogras, Umit Y.
    Sartor, Anderson L.
    Marculescu, Radu
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (11) : 4064 - 4077
  • [10] Enabling Resilience in Asynchronous Many-Task Programming Models
    Paul, Sri Raj
    Hayashi, Akihiro
    Slattengren, Nicole
    Kolla, Hemanth
    Whitlock, Matthew
    Bak, Seonmyeong
    Teranishi, Keita
    Mayo, Jackson
    Sarkar, Vivek
    EURO-PAR 2019: PARALLEL PROCESSING, 2019, 11725 : 346 - 360