Using Dynamic Broadcasts to Improve Task-Based Runtime Performances

Cited by: 6
Authors
Denis, Alexandre [1, 2]
Jeannot, Emmanuel [1, 2]
Swartvagher, Philippe [1, 2]
Thibault, Samuel [1, 2]
Affiliations
[1] Inria Bordeaux Sud Ouest, F-33405 Talence, France
[2] Univ Bordeaux, LaBRI, F-33405 Talence, France
Keywords
Task-based runtime systems; Communications; Collective; Broadcast
DOI
10.1007/978-3-030-57675-2_28
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Task-based runtimes have emerged in the HPC world to take advantage of the computing power of heterogeneous supercomputers and to achieve scalability. One of the main bottlenecks for scalability is the communication layer. Some task-based algorithms need to send the same data to multiple nodes. To optimize this communication pattern, libraries provide dedicated routines, such as MPI_Bcast. However, the requirements of MPI_Bcast do not fit well with the constraints of task-based runtime systems: it must be performed simultaneously by all involved nodes, and these nodes must know each other, which is not possible when each node runs a task scheduler that is not synchronized with the others. In this paper, we propose a new approach, called dynamic broadcasts, to overcome these constraints. The broadcast communication pattern required by the task-based algorithm is detected automatically; the broadcasting algorithm then relies on active messages and source routing, so that participating nodes do not need to know each other and do not need to synchronize. A receiver receives data the same way it receives a point-to-point communication, without having to know that the data arrives through a broadcast. We have implemented the algorithm in the StarPU runtime system using the NewMadeleine communication library. We performed benchmarks using the Cholesky factorization, which is known to use broadcasts, and observed up to 30% improvement in its total execution time.
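
The source-routing idea described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the StarPU or NewMadeleine implementation: the function names `split_recipients` and `simulate_broadcast` are hypothetical, and the binomial-style split of the recipient list is one possible tree shape chosen for illustration. Each message carries the list of nodes its receiver must forward to, so no node needs a global view of the participants.

```python
from collections import deque


def split_recipients(recipients):
    """Turn a recipient list into (next_hop, forward_list) pairs.

    Hypothetical helper: the node in the middle of the remaining list
    becomes the next hop and is handed the upper part of the list to
    forward to; the sender keeps the lower part and repeats.
    """
    hops = []
    remaining = list(recipients)
    while remaining:
        mid = len(remaining) // 2
        hops.append((remaining[mid], remaining[mid + 1:]))
        remaining = remaining[:mid]
    return hops


def simulate_broadcast(root, recipients, payload):
    """Simulate a source-routed broadcast without any global synchronization.

    Returns two dicts: node -> payload received, and node -> sender.
    Each node only ever looks at the forward list it was handed.
    """
    delivered = {}
    parent = {}
    queue = deque([(root, list(recipients))])
    while queue:
        node, to_forward = queue.popleft()
        for next_hop, sublist in split_recipients(to_forward):
            # The routing information (sublist) travels with the data.
            delivered[next_hop] = payload
            parent[next_hop] = node
            queue.append((next_hop, sublist))
    return delivered, parent


if __name__ == "__main__":
    delivered, parent = simulate_broadcast(0, [1, 2, 3, 4, 5, 6, 7], "tile")
    for node in sorted(parent):
        print(f"node {node} received from node {parent[node]}")
```

In this sketch the root sends only three messages for seven recipients, and every other node learns what to forward from the message itself, which mirrors the property that participants need not know each other in advance.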
Pages: 443-457
Number of pages: 15