Efficient Intranode Communication in GPU-Accelerated Systems

Cited by: 1

Authors:

Ji, Feng [1]

Aji, Ashwin M. [2]

Dinan, James [3]

Buntinas, Darius [3]

Balaji, Pavan [3]

Feng, Wu-chun [2]

Ma, Xiaosong [1,4]

Affiliations:

[1] North Carolina State Univ, Dept Comp Sci, Raleigh, NC 27695 USA

[2] Virginia Tech, Dept Comp Sci, Blacksburg, VA USA

[3] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA

[4] Oak Ridge Natl Lab, Div Math & Comp Sci, Oak Ridge, TN USA

Source:

2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW) | 2012

Funding:

U.S. National Science Foundation;

Keywords:

IMPLEMENTATION;

DOI:

10.1109/IPDPSW.2012.227

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Current implementations of MPI are unaware of accelerator memory (i.e., GPU device memory) and require programmers to explicitly move data between memory spaces. This approach is inefficient, especially for intranode communication, where it can result in several extra copy operations. In this work, we integrate GPU-awareness into a popular MPI runtime system and develop techniques to significantly reduce the cost of intranode communication involving one or more GPUs. Experimental results show up to a 2x increase in bandwidth, resulting in an average improvement of 4.3% in the total execution time of a halo exchange benchmark.


Pages: 1838-1847

Page count: 10

