Optimizing non-blocking collective operations for InfiniBand

Authors
Hoefler, Torsten [1]
Lumsdaine, Andrew [1]
Affiliations
[1] Indiana Univ, Open Syst Lab, Bloomington, IN 47405 USA
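Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract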
Non-blocking collective operations have recently been shown to be a promising complementary approach for overlapping communication and computation in parallel applications. However, to maximize the performance and usability of these operations, it is important that they progress concurrently with the application without introducing CPU overhead and without requiring explicit user intervention. While studying non-blocking collective operations in the context of our portable library (libNBC), we found that most MPI implementations do not sufficiently support overlap over the InfiniBand network. To address this issue, we developed a low-level communication layer for libNBC based on the Open Fabrics InfiniBand verbs API. With this layer, we are able to achieve high degrees of overlap without the need to explicitly progress the communication operations. We show that the communication overhead of parallel application kernels can be reduced by up to 92% without requiring user intervention to make progress.
Pages: 182 / +
Page count: 2
Related papers
50 in total
  • [1] Optimizing a conjugate gradient solver with non-blocking collective operations
    Hoefler, Torsten
    Gottschling, Peter
    Lumsdaine, Andrew
    Rehm, Wolfgang
    PARALLEL COMPUTING, 2007, 33 (09) : 624 - 633
  • [2] Optimizing a conjugate gradient solver with non-blocking collective operations
    Hoefler, Torsten
    Gottschling, Peter
    Rehm, Wolfgang
    Lumsdaine, Andrew
    RECENT ADVANCES IN PARALLEL VIRTUAL MACHINE AND MESSAGE PASSING INTERFACE, 2006, 4192 : 374 - 382
  • [3] A case for non-blocking collective operations
    Hoefler, Torsten
    Squyres, Jeffrey M.
    Rehm, Wolfgang
    Lumsdaine, Andrew
    FRONTIERS OF HIGH PERFORMANCE COMPUTING AND NETWORKING - ISPA 2006 WORKSHOPS, PROCEEDINGS, 2006, 4331 : 155 - +
  • [4] A case for standard non-blocking collective operations
    Hoefler, Torsten
    Kambadur, Prabhanjan
    Graham, Richard L.
    Shipman, Galen
    Lumsdaine, Andrew
    RECENT ADVANCES IN PARALLEL VIRTUAL MACHINE AND MESSAGE PASSING INTERFACE, 2007, 4757 : 125 - +
  • [5] Auto-tuning Non-blocking Collective Communication Operations
    Barigou, Youcef
    Venkatesan, Vishwanath
    Gabriel, Edgar
    2015 IEEE 29TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, 2015, : 1204 - 1213
  • [6] Implementation and Performance Analysis of Non-Blocking Collective Operations for MPI
    Hoefler, Torsten
    Lumsdaine, Andrew
    Rehm, Wolfgang
    2007 ACM/IEEE SC07 CONFERENCE, 2010, : 127 - +
  • [7] Designing Non-blocking Allreduce with Collective Offload on InfiniBand Clusters: A Case Study with Conjugate Gradient Solvers
    Kandalla, K.
    Yang, U.
    Keasler, J.
    Kolev, T.
    Moody, A.
    Subramoni, H.
    Tomko, K.
    Vienne, J.
    de Supinski, B. R.
    Panda, D. K.
    2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2012, : 1156 - 1167
  • [8] Progression of MPI non-blocking collective operations using Hyper-Threading
    Miwa, Masahiro
    Nakashima, Kohta
    23RD EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2015), 2015, : 163 - 171
  • [9] Non-blocking Patricia Tries with Replace Operations
    Shafiei, Niloufar
    2013 IEEE 33RD INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS), 2013, : 216 - 225
  • [10] Why Non-blocking Operations Should be Selfish
    Gibson, Joel
    Gramoli, Vincent
    DISTRIBUTED COMPUTING (DISC 2015), 2015, 9363 : 200 - 214