Fine-Grained MPI plus OpenMP Plasma Simulations: Communication Overlap with Dependent Tasks

Cited by: 3
Authors
Richard, Jerome [1, 2]
Latu, Guillaume [1]
Bigot, Julien [3]
Gautier, Thierry [4]
Affiliations
[1] CEA, IRFM, F-13108 St Paul Les Durance, France
[2] Zebrys, Toulouse, France
[3] Univ Paris Saclay, UVSQ, Univ Paris Sud, Maison Simulat, CEA, CNRS, Gif Sur Yvette, France
[4] Univ Lyon, INRIA, CNRS, ENS Lyon, Univ Claude Bernard Lyon 1, LIP, Lyon, France
Source
Keywords
Dependent tasks; OpenMP 4.5; MPI; Many-core
DOI
10.1007/978-3-030-29400-7_30
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
This paper demonstrates how OpenMP 4.5 tasks can be used to efficiently overlap computations and MPI communications based on a case-study conducted on multi-core and many-core architectures. It focuses on task granularity, dependencies and priorities, and also identifies some limitations of OpenMP. Results on 64 Skylake nodes show that while 64% of the wall-clock time is spent in MPI communications, 60% of the cores are busy in computations, which is a good result. Indeed, the chosen dataset is small enough to be a challenging case in terms of overlap and thus useful to assess worst-case scenarios in future simulations. Two key features were identified: by using task priority we improved the performance by 5.7% (mainly due to an improved overlap), and with recursive tasks we shortened the execution time by 9.7%. We also illustrate the need to have access to tools for task tracing and task visualization. These tools allowed a fine understanding and a performance increase for this task-based OpenMP+MPI code.
Pages: 419-433
Number of pages: 15
Related papers
50 in total
  • [31] Data for Image Recognition Tasks: An Efficient Tool for Fine-Grained Annotations
    Filax, Marco
    Gonschorek, Tim
    Ortmeier, Frank
    ICPRAM: PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS, 2019, : 900 - 907
  • [32] Fine-grained alignment of cryo-electron subtomograms based on MPI parallel optimization
    Lu, Yongchun
    Zeng, Xiangrui
    Zhao, Xiaofang
    Li, Shirui
    Li, Hua
    Gao, Xin
    Xu, Min
    BMC BIOINFORMATICS, 2019, 20 (01)
  • [33] Fine-grained alignment of cryo-electron subtomograms based on MPI parallel optimization
    Yongchun Lü
    Xiangrui Zeng
    Xiaofang Zhao
    Shirui Li
    Hua Li
    Xin Gao
    Min Xu
    BMC Bioinformatics, 20
  • [34] Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication
    Vasudevan, Vijay
    Phanishayee, Amar
    Shah, Hiral
    Krevat, Elie
    Andersen, David G.
    Ganger, Gregory R.
    Gibson, Garth A.
    Mueller, Brian
    SIGCOMM 2009, 2009, : 303 - 314
  • [35] CMB: A Configurable Messaging Benchmark to Explore Fine-Grained Communication
    Marts, W. Pepper
    Kruse, Donald A.
    Dosanjh, Matthew G. F.
    Schonbein, Whit
    Levy, Scott
    Bridges, Patrick G.
    2024 IEEE 24TH INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING, CCGRID 2024, 2024, : 28 - 38
  • [36] Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC
    Lagraviere, Jeremie
    Langguth, Johannes
    Prugger, Martina
    Einkemmer, Lukas
    Phuong Hoai Ha
    Cai, Xing
    SCIENTIFIC PROGRAMMING, 2019, 2019
  • [37] Visible Light Communication Technology for Fine-grained Indoor Localization
    Vieira, M.
    Vieira, M. A.
    Louro, P.
    Fantoni, A.
    Vieira, P.
    OPTICAL INTERCONNECTS XVIII, 2018, 10538
  • [38] Fine-Grained Bandwidth Estimation for Smart Grid Communication Network
    Luo, Jingtang
    Liao, Jingru
    Zhang, Chenlin
    Wang, Ziqi
    Zhang, Yuhang
    Xu, Jie
    Huang, Zhengwen
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2022, 32 (02): : 1225 - 1239
  • [39] Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication
    Vasudevan, Vijay
    Phanishayee, Amar
    Shah, Hiral
    Krevat, Elie
    Andersen, David G.
    Ganger, Gregory R.
    Gibson, Garth A.
    Mueller, Brian
    ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2009, 39 (04) : 303 - 314
  • [40] Unleashing Fine-Grained Parallelism on Embedded Many-Core Accelerators with Lightweight OpenMP Tasking
    Tagliavini, Giuseppe
    Cesarini, Daniele
    Marongiu, Andrea
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2018, 29 (09) : 2150 - 2163