Leveraging Non-Blocking Collective Communication in High-Performance Applications

Authors
Hoefler, Torsten [1]
Gottschling, Peter [1]
Lumsdaine, Andrew [1]
Affiliation
[1] Indiana Univ, Open Syst Lab, Bloomington, IN 47404 USA
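Keywords
non-blocking collectives; overlap; MPI
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract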
Although overlapping communication with computation is an important mechanism for achieving high performance in parallel programs, developing applications that actually achieve good overlap can be difficult. Existing approaches are typically based on manual or compiler-based transformations. This paper presents a pattern- and library-based approach to optimizing collective communication in parallel high-performance applications, using non-blocking collective operations to overlap communication and computation. Common communication and computation patterns in iterative SPMD computations motivate the transformations we present. Our approach lets the programmer optimize communication and computation in an application separately, while automating the interaction between the two to achieve maximum overlap. Performance results with a model application show more than a 90% decrease in communication overhead, resulting in a 21% overall performance improvement.
Pages: 113-115
Page count: 3