Efficient Intranode Communication in GPU-Accelerated Systems

Cited by: 1
Authors
Ji, Feng [1]
Aji, Ashwin M. [2]
Dinan, James [3]
Buntinas, Darius [3]
Balaji, Pavan [3]
Feng, Wu-chun [2]
Ma, Xiaosong [1, 4]
Affiliations
[1] North Carolina State Univ, Dept Comp Sci, Raleigh, NC 27695 USA
[2] Virginia Tech, Dept Comp Sci, Blacksburg, VA USA
[3] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA
[4] Oak Ridge Natl Lab, Div Math & Comp Sci, Oak Ridge, TN USA
Funding
National Science Foundation (USA)
Keywords
IMPLEMENTATION
DOI
10.1109/IPDPSW.2012.227
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Current implementations of MPI are unaware of accelerator memory (i.e., GPU device memory) and require programmers to explicitly move data between memory spaces. This approach is inefficient, especially for intranode communication, where it can result in several extra copy operations. In this work, we integrate GPU-awareness into a popular MPI runtime system and develop techniques to significantly reduce the cost of intranode communication involving one or more GPUs. Experimental results show up to a 2x increase in bandwidth, yielding an average improvement of 4.3% in the total execution time of a halo exchange benchmark.
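To illustrate the "extra copy operations" the abstract mentions, here is a toy model rather than the authors' implementation. It calls neither MPI nor CUDA: memory spaces are plain bytearrays, and every name (`CopyCounter`, `manual_staging`, `gpu_aware`) is hypothetical. The four-copy manual path and the two-copy GPU-aware path are assumptions chosen for illustration; the record does not state the actual copy counts.

```python
# Toy model of intranode GPU-to-GPU message passing.
# Memory spaces are plain bytearrays; each copy between them is counted.
# No MPI or CUDA calls are made, and all names are hypothetical.


class CopyCounter:
    """Counts how many buffer-to-buffer copies a transfer path performs."""

    def __init__(self):
        self.count = 0

    def copy(self, dst, src):
        """Copy src into dst (same length) and record one copy operation."""
        dst[:] = src
        self.count += 1


def manual_staging(payload, counter):
    """GPU-unaware MPI: the programmer stages data through host buffers."""
    n = len(payload)
    src_device = bytearray(payload)  # sender's GPU buffer
    src_host = bytearray(n)          # sender's host staging buffer
    shared = bytearray(n)            # intranode shared-memory region
    dst_host = bytearray(n)          # receiver's host staging buffer
    dst_device = bytearray(n)        # receiver's GPU buffer

    counter.copy(src_host, src_device)  # explicit device-to-host copy
    counter.copy(shared, src_host)      # send: host buffer to shared memory
    counter.copy(dst_host, shared)      # receive: shared memory to host buffer
    counter.copy(dst_device, dst_host)  # explicit host-to-device copy
    return bytes(dst_device)


def gpu_aware(payload, counter):
    """Assumed GPU-aware runtime: device buffers move directly to and from
    the shared-memory region, so the host staging buffers disappear."""
    n = len(payload)
    src_device = bytearray(payload)  # sender's GPU buffer
    shared = bytearray(n)            # intranode shared-memory region
    dst_device = bytearray(n)        # receiver's GPU buffer

    counter.copy(shared, src_device)  # device memory to shared memory
    counter.copy(dst_device, shared)  # shared memory to device memory
    return bytes(dst_device)


if __name__ == "__main__":
    message = bytes(range(16))
    manual, aware = CopyCounter(), CopyCounter()
    assert manual_staging(message, manual) == message
    assert gpu_aware(message, aware) == message
    print("manual staging copies:", manual.count)
    print("GPU-aware copies:", aware.count)
```

In this model both paths deliver the same bytes, and the only difference is how many times the payload is copied. A real runtime would also have to consider pipelining, pinned memory, and synchronization, none of which this sketch attempts to capture.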
Pages: 1838-1847
Page count: 10