Fine-Grained MPI+OpenMP Plasma Simulations: Communication Overlap with Dependent Tasks

Cited by: 3

Authors:

Richard, Jerome [1,2]

Latu, Guillaume [1]

Bigot, Julien [3]

Gautier, Thierry [4]

Affiliations:

[1] CEA, IRFM, F-13108 St Paul Les Durance, France

[2] Zebrys, Toulouse, France

[3] Univ Paris Saclay, UVSQ, Univ Paris Sud, Maison Simulat, CEA, CNRS, Gif Sur Yvette, France

[4] Univ Lyon, INRIA, CNRS, ENS Lyon, Univ Claude Bernard Lyon 1, LIP, Lyon, France

Source:

EURO-PAR 2019: PARALLEL PROCESSING | 2019 / Vol. 11725

Keywords:

Dependent tasks; OpenMP 4.5; MPI; Many-core

DOI:

10.1007/978-3-030-29400-7_30

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

This paper demonstrates how OpenMP 4.5 tasks can be used to efficiently overlap computations and MPI communications based on a case-study conducted on multi-core and many-core architectures. It focuses on task granularity, dependencies and priorities, and also identifies some limitations of OpenMP. Results on 64 Skylake nodes show that while 64% of the wall-clock time is spent in MPI communications, 60% of the cores are busy in computations, which is a good result. Indeed, the chosen dataset is small enough to be a challenging case in terms of overlap and thus useful to assess worst-case scenarios in future simulations. Two key features were identified: by using task priority we improved the performance by 5.7% (mainly due to an improved overlap), and with recursive tasks we shortened the execution time by 9.7%. We also illustrate the need to have access to tools for task tracing and task visualization. These tools allowed a fine understanding and a performance increase for this task-based OpenMP+MPI code.
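The combination of task dependencies and task priorities named in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: `fake_exchange`, `compute_block`, `run_pipeline`, and the block sizes are hypothetical names and values chosen for the example, and the MPI exchange is replaced by a plain copy so the block runs without an MPI installation. Each compute task produces one block; a higher-priority "communication" task depends on that block, so the runtime may start it as soon as the block is ready while other blocks are still being computed.

```c
#include <assert.h>

#define NBLOCKS 4   /* hypothetical number of data blocks */
#define BLOCK   8   /* hypothetical block length */

/* Hypothetical stand-in for an MPI exchange: here it is only a local copy. */
static void fake_exchange(const double *src, double *dst, int n) {
    assert(n > 0);
    for (int i = 0; i < n; i++) dst[i] = src[i];
}

/* Hypothetical compute kernel applied to one block. */
static void compute_block(double *blk, int n) {
    assert(n > 0);
    for (int i = 0; i < n; i++) blk[i] = 2.0 * blk[i] + 1.0;
}

/* Builds one compute task and one dependent "communication" task per block,
 * then returns a checksum of the received data. */
double run_pipeline(void) {
    static double data[NBLOCKS][BLOCK];
    static double recv[NBLOCKS][BLOCK];
    double sum = 0.0;

    for (int b = 0; b < NBLOCKS; b++)
        for (int i = 0; i < BLOCK; i++)
            data[b][i] = (double)b;

    #pragma omp parallel
    #pragma omp single
    {
        for (int b = 0; b < NBLOCKS; b++) {
            /* Compute task: updates block b in place. */
            #pragma omp task depend(inout: data[b][0]) firstprivate(b)
            compute_block(data[b], BLOCK);

            /* Communication task: waits for block b only; the priority
             * clause is a hint asking the runtime to schedule it early. */
            #pragma omp task depend(in: data[b][0]) depend(out: recv[b][0]) \
                             priority(10) firstprivate(b)
            fake_exchange(data[b], recv[b], BLOCK);
        }
        #pragma omp taskwait
    }

    for (int b = 0; b < NBLOCKS; b++)
        for (int i = 0; i < BLOCK; i++)
            sum += recv[b][i];
    return sum;
}
```

Calling `run_pipeline()` from a `main` function and compiling with an OpenMP flag such as `-fopenmp` lets the tasks run concurrently; without the flag the pragmas are ignored and the same code runs serially with the same result. In a real hybrid code the copy would be an MPI call, which also requires a suitable MPI thread-support level; that part is outside this sketch.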

Pages: 419-433

Page count: 15
