Simulation-based optimization and sensibility analysis of MPI applications: Variability matters

Cited by: 2
Authors
Cornebize, Tom [1]
Legrand, Arnaud [1]
Affiliations
[1] Univ Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, F-38000 Grenoble, France
Keywords
Simulation; Validation; Sensibility analysis; SimGrid; HPL
DOI
10.1016/j.jpdc.2022.04.002
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Finely tuning MPI applications and understanding the influence of key parameters (number of processes, granularity, collective operation algorithms, virtual topology, and process placement) is critical to obtain good performance on supercomputers. With the high consumption of running applications at scale, doing so solely to optimize their performance is particularly costly. Having inexpensive but faithful predictions of expected performance could be a great help for researchers and system administrators. The methodology we propose decouples the complexity of the platform, which is captured through statistical models of the performance of its main components (MPI communications, BLAS operations), from the complexity of adaptive applications by emulating the application and skipping regular non-MPI parts of the code. We demonstrate the capability of our method with High-Performance Linpack (HPL), the benchmark used to rank supercomputers in the TOP500, which requires careful tuning. We briefly present (1) how the open-source version of HPL can be slightly modified to allow a fast emulation on a single commodity server at the scale of a supercomputer. Then we present (2) an extensive (in)validation study that compares simulation with real experiments and demonstrates our ability to predict the performance of HPL within a few percent consistently. This study allows us to identify the main modeling pitfalls (e.g., spatial and temporal node variability or network heterogeneity and irregular behavior) that need to be considered. Last, we show (3) how our "surrogate" allows studying several subtle HPL parameter optimization problems while accounting for uncertainty on the platform. © 2022 Elsevier Inc. All rights reserved.
Pages: 111-125
Number of pages: 15