Efficient and scalable All-to-All Personalized Exchange for InfiniBand-based clusters

Authors
Sur, S [1]
Jin, HW [1]
Panda, DK [1]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
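Keywords
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract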
The All-to-All Personalized Exchange is the densest collective communication function offered by the MPI specification. In this operation, every process sends a different message to every other participating process. This collective operation is essential for many parallel scientific applications. As system and message sizes grow, it becomes challenging to offer a fast, scalable and efficient implementation of this operation. InfiniBand is an emerging modern interconnect. It offers very low latency, high bandwidth and one-sided operations such as RDMA write. Its advanced features, such as RDMA write gather, allow us to design and implement All-to-All algorithms much more efficiently than in the past. Our aim in this paper is to design efficient and scalable implementations of traditional personalized exchange algorithms. We present two novel approaches to designing All-to-All algorithms, for short and long messages respectively. The Hypercube RDMA Write Gather and Direct Eager schemes effectively leverage the RDMA and RDMA write gather mechanisms offered by InfiniBand. Performance evaluation of our design and implementation shows that it reduces the All-to-All communication time by up to a factor of 3.07 for 32-byte messages on a 16-node InfiniBand cluster. Our analytical models suggest that the proposed designs will perform 64% better on InfiniBand clusters with 1024 nodes for 4k message size.
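To make the operation described in the abstract concrete, the following is a minimal single-process simulation in Python, not the authors' RDMA-based implementation. It contrasts the direct definition of a personalized exchange (process i delivers a distinct block to every process j) with the textbook hypercube store-and-forward pattern, which finishes in log2(p) pairwise steps when p is a power of two. The function names `alltoall_direct` and `alltoall_hypercube` and the `(src, dst, payload)` block layout are assumptions made for illustration only.

```python
def alltoall_direct(send):
    """Reference semantics: recv[j][i] is the block that process i sent to process j."""
    p = len(send)
    return [[send[i][j] for i in range(p)] for j in range(p)]


def alltoall_hypercube(send):
    """Simulate a hypercube store-and-forward exchange (illustrative sketch).

    In the step for a given bit, rank r pairs with rank r XOR bit and forwards
    every buffered block whose destination differs from r in that bit.
    Returns (recv, number_of_steps).
    """
    p = len(send)
    assert p > 0 and p & (p - 1) == 0, "process count must be a power of two"

    # held[r]: blocks currently buffered at rank r, each tagged (src, dst, payload)
    held = [[(r, dst, send[r][dst]) for dst in range(p)] for r in range(p)]

    steps = 0
    bit = 1
    while bit < p:
        new_held = [[] for _ in range(p)]
        for r in range(p):
            partner = r ^ bit
            for block in held[r]:
                _, dst, _ = block
                if (dst ^ r) & bit:
                    # Destination lies in the partner's half of the cube: forward it.
                    new_held[partner].append(block)
                else:
                    new_held[r].append(block)
        held = new_held
        bit <<= 1
        steps += 1

    # After the last step every block sits at its destination rank.
    recv = [[None] * p for _ in range(p)]
    for r in range(p):
        for src, dst, payload in held[r]:
            assert dst == r
            recv[r][src] = payload
    return recv, steps


if __name__ == "__main__":
    p = 8
    send = [[f"{i}->{j}" for j in range(p)] for i in range(p)]
    recv, steps = alltoall_hypercube(send)
    print(recv == alltoall_direct(send), steps)
```

In each step of this sketch a rank forwards half of the blocks it holds (p/2 of them) to one partner as a batch. Sending many small blocks to one peer in a single operation is the kind of pattern a gather-style send is suited to; how the paper maps this onto RDMA write gather is not specified in the abstract and is not modeled here.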
Pages: 275-282
Page count: 8