Influence of Tasks Duration Variability on Task-Based Runtime Schedulers

Cited by: 3
Authors
Beaumont, Olivier [1]
Eyraud-Dubois, Lionel [1]
Gao, Yihong [1]
Affiliations
[1] Univ Bordeaux, CNRS, Bordeaux INP, INRIA, LaBRI, Talence, France
Keywords
DOI
10.1109/IPDPSW.2019.00013
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the context of HPC platforms, individual nodes nowadays consist of heterogeneous processing resources such as GPUs and multicore CPUs. These resources share communication and storage resources, which induces complex co-scheduling effects and makes it hard to predict the exact duration of a task or of a communication. To cope with these issues, dynamic runtime schedulers such as STARPU have been developed. These systems base their decisions at runtime on the state of the platform and possibly on static task priorities computed offline. In this paper, our goal is to quantify performance variability in the context of heterogeneous HPC nodes, focusing on very regular dense linear algebra kernels such as Cholesky and LU factorizations. We therefore first evaluate the variability of individual block-size kernels. Then, we analyze the impact of this variability at the scale of a full application on a dynamic runtime scheduler such as STARPU, in order to determine whether the strategies designed in the context of MapReduce applications to cope with stragglers could be transferred to HPC systems, or whether the dynamic nature of runtime schedulers is enough to cope with actual performance variations, even in the presence of task dependencies.
Pages: 16-25
Number of pages: 10