Optimizing MPI One Sided Communication on Multi-core InfiniBand Clusters Using Shared Memory Backed Windows

Cited by: 0
Authors
Potluri, Sreeram [1 ]
Wang, Hao [1 ]
Dhanraj, Vijay [1 ]
Sur, Sayantan [1 ]
Panda, Dhabaleswar K. [1 ]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
Keywords
MPI; shared memory; one-sided communication;
DOI
Not available
CLC Classification
TP301 [Theory, Methods]
Discipline Code
081202
Abstract
The Message Passing Interface (MPI) has been very popular for programming parallel scientific applications. As multi-core architectures have become prevalent, a major question that has emerged is about the use of MPI within a compute node and its impact on communication costs. The one-sided communication interface in MPI provides a mechanism to reduce communication costs by removing the matching requirements of the send/receive model. The MPI standard provides the flexibility to allocate memory windows backed by shared memory. However, state-of-the-art open-source MPI libraries do not leverage this optimization opportunity for commodity clusters. In this paper, we present a design and implementation of an intra-node MPI one-sided interface using shared memory backed windows on multi-core clusters. We use the MVAPICH2 MPI library for design, implementation and evaluation. Micro-benchmark evaluation shows that the new design can bring up to 85% improvement in Put, Get and Accumulate latencies with passive synchronization mode. The bandwidth performance of Put and Get improves by 64% and 42%, respectively. The SPLASH LU benchmark shows an improvement of up to 55% with the new design on a 32-core Magny-Cours node, and a similar improvement on a 12-core Westmere node. The mean BFS time in Graph500 reduces by 39% and 77% on the Magny-Cours and Westmere nodes, respectively.
Pages: 99-109
Page count: 11