Efficient Intranode Communication in GPU-Accelerated Systems

Cited by: 1
Authors
Ji, Feng [1]
Aji, Ashwin M. [2]
Dinan, James [3]
Buntinas, Darius [3]
Balaji, Pavan [3]
Feng, Wu-chun [2]
Ma, Xiaosong [1, 4]
Affiliations
[1] North Carolina State Univ, Dept Comp Sci, Raleigh, NC 27695 USA
[2] Virginia Tech, Dept Comp Sci, Blacksburg, VA USA
[3] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA
[4] Oak Ridge Natl Lab, Div Math & Comp Sci, Oak Ridge, TN USA
Funding
U.S. National Science Foundation
Keywords
IMPLEMENTATION
DOI
10.1109/IPDPSW.2012.227
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Current implementations of MPI are unaware of accelerator memory (i.e., GPU device memory) and require programmers to explicitly move data between memory spaces. This approach is inefficient, especially for intranode communication, where it can result in several extra copy operations. In this work, we integrate GPU-awareness into a popular MPI runtime system and develop techniques to significantly reduce the cost of intranode communication involving one or more GPUs. Experimental results show up to a 2x increase in bandwidth, which yields an average 4.3% improvement in the total execution time of a halo exchange benchmark.
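The "extra copy operations" mentioned in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation: it assumes, purely for illustration, that a GPU-unaware path stages data through host buffers on both sides of a shared-memory transfer, and that a GPU-aware runtime could move data between device memory and the shared buffer directly. All function names and copy labels are hypothetical, and plain bytearrays stand in for device, host, and shared memory.

```python
# Toy model (assumption, not the paper's design): count the buffer-to-buffer
# copies needed to deliver one intranode message between two GPUs.
# bytearrays stand in for device, host, and shared-memory buffers.

def copy(src, dst, log, label):
    """Copy src into dst (same length) and record which hop was taken."""
    dst[:] = src
    log.append(label)

def manual_staging(send_dev, recv_dev):
    """GPU-unaware path: the application stages data through host memory."""
    n = len(send_dev)
    log = []
    send_host, shared, recv_host = bytearray(n), bytearray(n), bytearray(n)
    copy(send_dev, send_host, log, "device -> host (explicit, sender)")
    copy(send_host, shared, log, "host -> shared buffer (send)")
    copy(shared, recv_host, log, "shared buffer -> host (receive)")
    copy(recv_host, recv_dev, log, "host -> device (explicit, receiver)")
    return log

def gpu_aware(send_dev, recv_dev):
    """Hypothetical GPU-aware path: the runtime skips the host staging buffers."""
    log = []
    shared = bytearray(len(send_dev))
    copy(send_dev, shared, log, "device -> shared buffer (runtime)")
    copy(shared, recv_dev, log, "shared buffer -> device (runtime)")
    return log

if __name__ == "__main__":
    halo = bytearray(b"halo-cells")
    for path in (manual_staging, gpu_aware):
        dst = bytearray(len(halo))
        hops = path(halo, dst)
        print(path.__name__, len(hops), "copies, delivered:", dst == halo)
```

In this model the staged path performs four copies and the direct path two. The real cost depends on the hardware and on the techniques the paper describes, which the abstract does not detail.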
Pages: 1838-1847
Page count: 10