Synchronizing MPI Processes in Space and Time

Times cited: 1
Authors
Schuchart, Joseph [1]
Hunold, Sascha [2]
Bosilca, George [1]
Affiliations
[1] Univ Tennessee, Innovat Comp Lab, Knoxville, TN 37996 USA
[2] TU Wien, Vienna, Austria
Funding
Austrian Science Fund;
Keywords
MPI; collective communication; process synchronization; clock synchronization; OSU benchmarks; reduce; allreduce; broadcast; barrier; benchmarking;
DOI
10.1145/3615318.3615325
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Performance benchmarks are an integral part of the development and evaluation of parallel algorithms, both in distributed applications and in MPI implementations themselves. The initial step of the benchmark process is to obtain a common timestamp to mark the start of an operation across all involved processes, and the state of the art in many applications and widely used MPI benchmark suites is to use MPI barriers for this purpose. In this paper, we show that the synchronization in space provided by MPI_Barrier is insufficient for obtaining proper benchmark results for parallel distributed algorithms, using MPI collective operations as examples. The resulting lack of a global start timestamp for an operation leads to skewed results, and the choice of barrier algorithm has a significant impact on them. To mitigate these issues, we propose and discuss the implementation of MPIX_Harmonize, which extends the synchronization in space provided by MPI_Barrier with a synchronization in time to guarantee a common starting timestamp across all involved processes. By replacing MPI_Barrier with MPIX_Harmonize, benchmark implementors can eliminate skews resulting from barrier algorithms and achieve stable performance benchmark results. We show that proper time synchronization can have a significant impact on the benchmark results for various implementations of MPI_Allreduce, MPI_Reduce, and MPI_Bcast.
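The effect described in the abstract can be illustrated with a small toy model. The sketch below is not the paper's implementation and does not call MPI. It assumes a hypothetical linear-release barrier in which rank i leaves the barrier i message latencies after rank 0, and a fully synchronizing collective that can only finish once the last process has entered. All function names and the numeric values (8 processes, 2.0 us latency, 10.0 us operation time) are illustrative assumptions.

```python
def linear_barrier_exit_times(num_procs, latency):
    """Hypothetical barrier model: rank i exits i * latency after rank 0."""
    return [rank * latency for rank in range(num_procs)]


def common_start_times(num_procs, start_at):
    """Idealized time-synchronized start: every rank begins at `start_at`."""
    return [start_at] * num_procs


def measured_runtimes(start_times, op_duration):
    """Toy model of a fully synchronizing collective.

    The operation completes `op_duration` after the last rank has entered,
    and each rank measures the time from its own start to that completion.
    """
    finish = max(start_times) + op_duration
    return [finish - start for start in start_times]


if __name__ == "__main__":
    num_procs, latency_us, op_us = 8, 2.0, 10.0  # assumed example values

    # Start after a barrier only: ranks enter the operation at different times.
    after_barrier = measured_runtimes(
        linear_barrier_exit_times(num_procs, latency_us), op_us
    )
    # Start at an agreed timestamp: all ranks enter at the same time.
    harmonized = measured_runtimes(
        common_start_times(num_procs, start_at=20.0), op_us
    )

    print("barrier start, slowest rank (us):   ", max(after_barrier))
    print("common timestamp, slowest rank (us):", max(harmonized))
```

In this model, the ranks released early by the barrier report a runtime inflated by the barrier's exit skew, whereas a common start timestamp makes every rank report only the duration of the operation itself. Real barrier algorithms and collectives have more complex timing behavior than this sketch captures.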
Pages: 11