Optimizing non-blocking collective operations for InfiniBand

Cited by: 0

Authors:

Hoefler, Torsten [1]

Lumsdaine, Andrew [1]

Affiliation:

[1] Indiana Univ, Open Syst Lab, Bloomington, IN 47405 USA

Source:

2008 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-8 | 2008

Funding:

U.S. National Science Foundation

DOI:

Not available

Chinese Library Classification:

TP301 [Theory and Methods]

Subject classification code:

081202

Abstract:

Non-blocking collective operations have recently been shown to be a promising complementary approach for overlapping communication and computation in parallel applications. However, to maximize the performance and usability of these operations, it is important that they progress concurrently with the application without introducing CPU overhead and without requiring explicit user intervention. While studying non-blocking collective operations in the context of our portable library (libNBC), we found that most MPI implementations do not sufficiently support overlap over the InfiniBand network. To address this issue, we developed a low-level communication layer for libNBC based on the OpenFabrics InfiniBand verbs API. With this layer we are able to achieve high degrees of overlap without the need to explicitly progress the communication operations. We show that the communication overhead of parallel application kernels can be reduced by up to 92% while not requiring user intervention to make progress.
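
The start, compute, wait pattern that the abstract describes can be sketched as follows. This is only an analogy in plain Python, not libNBC, MPI, or verbs code: a worker thread stands in for a communication layer that progresses the operation on its own, and the names `allreduce_sum`, `start_nonblocking`, and `overlapped_run` are hypothetical, introduced here for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def allreduce_sum(contributions):
    # Hypothetical stand-in for a collective: combine one value per "rank".
    return sum(contributions)


def start_nonblocking(pool: ThreadPoolExecutor, fn, *args) -> Future:
    # Returns a handle immediately; the worker thread makes progress
    # without any further calls from the caller.
    return pool.submit(fn, *args)


def overlapped_run(contributions, n):
    with ThreadPoolExecutor(max_workers=1) as pool:
        # 1. Start the operation and get a handle back.
        handle = start_nonblocking(pool, allreduce_sum, contributions)
        # 2. Do independent work that does not need the collective's result.
        local = sum(i * i for i in range(n))
        # 3. Wait only at the point where the result is actually needed.
        total = handle.result()
    return total, local


if __name__ == "__main__":
    print(overlapped_run([1, 2, 3, 4], 1000))
```

The point of the sketch is that step 2 contains no polling or test calls: the caller never has to intervene for the operation to advance, which is the property the abstract asks of the communication layer. It says nothing about actual performance on InfiniBand.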

Pages: 182 / +

Page count: 2
