Leveraging Non-Blocking Collective Communication in High-Performance Applications

Cited by: 0
Authors
Hoefler, Torsten [1]
Gottschling, Peter [1]
Lumsdaine, Andrew [1]
Affiliations
[1] Indiana Univ, Open Syst Lab, Bloomington, IN 47404 USA
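Keywords
non-blocking collectives; overlap; MPI
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract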
Although overlapping communication with computation is an important mechanism for achieving high performance in parallel programs, developing applications that actually achieve good overlap can be difficult. Existing approaches are typically based on manual or compiler-based transformations. This paper presents a pattern- and library-based approach to optimizing collective communication in parallel high-performance applications, based on using non-blocking collective operations to enable overlapping of communication and computation. Common communication and computation patterns in iterative SPMD computations are used to motivate the transformations we present. Our approach provides the programmer with the capability to separately optimize communication and computation in an application, while automating the interaction between computation and communication to achieve maximum overlap. Performance results with a model application show more than a 90% decrease in communication overhead, resulting in 21% overall performance improvements.
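The overlap idea in the abstract can be illustrated with a minimal sketch. This is not the authors' library or their transformations: it assumes a thread pool as a stand-in for a non-blocking collective, and the names `fake_allreduce`, `local_work`, `blocking_step`, `overlapped_step`, and `COMM_DELAY` are hypothetical. In MPI-3 terms, the submit/result pair plays roughly the role of `MPI_Iallreduce` followed by `MPI_Wait`.

```python
import time
from concurrent.futures import ThreadPoolExecutor

COMM_DELAY = 0.05  # hypothetical network latency in seconds (assumption)


def fake_allreduce(contributions):
    """Stand-in for a collective sum; the sleep imitates time on the network."""
    time.sleep(COMM_DELAY)
    return sum(contributions)


def local_work(n):
    """Computation that does not depend on the collective's result."""
    return sum(i * i for i in range(n))


def blocking_step(contributions, n):
    """Communicate first, then compute: the two costs add up."""
    total = fake_allreduce(contributions)
    return total, local_work(n)


def overlapped_step(pool, contributions, n):
    """Start the collective, compute while it is in flight, then wait."""
    request = pool.submit(fake_allreduce, contributions)  # returns immediately
    partial = local_work(n)                               # overlaps with the sleep
    total = request.result()                              # blocks until complete
    return total, partial


with ThreadPoolExecutor(max_workers=1) as pool:
    blocking = blocking_step([1, 2, 3, 4], 1000)
    overlapped = overlapped_step(pool, [1, 2, 3, 4], 1000)
```

Both variants return the same values; only the ordering of waiting and computing differs, which is why independent computation is needed for any overlap to be possible.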
Pages: 113-115
Number of pages: 3
Related papers
50 in total
  • [41] Design of the scheduler for the high-capacity non-blocking packet switch
    Petrovic, Milos
    Smiljanic, Aleksandra
    HPSR: 2006 WORKSHOP ON HIGH PERFORMANCE SWITCHING AND ROUTING, 2006, : 397 - 402
  • [42] CMOS compatible reconfigurable filter for high bandwidth non-blocking operation
    Lira, Hugo L. R.
    Poitras, Carl B.
    Lipson, Michal
    OPTICS EXPRESS, 2011, 19 (21): : 20115 - 20121
  • [43] Parallel domain decomposition method with non-blocking communication for flow through porous media
    Lemmer, Andreas
    Hilfer, Rudolf
    JOURNAL OF COMPUTATIONAL PHYSICS, 2015, 281 : 970 - 981
  • [44] Non-Blocking Synchronization Between Real-Time and Non-Real-Time Applications
    Ruiz, Alejandro Perez
    Rivas, Mario Aldea
    Harbour, Michael Gonzalez
    IEEE ACCESS, 2020, 8 : 147618 - 147634
  • [45] Software Model Checking for Distributed Systems with Selector-Based, Non-blocking Communication
    Artho, Cyrille
    Hagiya, Masami
    Potter, Richard
    Tanabe, Yoshinori
    Weitl, Franz
    Yamamoto, Mitsuharu
    2013 28TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE), 2013, : 169 - 179
  • [46] A High-Performance Collective I/O Framework Leveraging Node-Local Persistent Memory
    Sanchez, Keegan
    Gavin, Alex
    Byna, Suren
    Wu, Kesheng
    Zhang, Xuechen
    EURO-PAR 2024: PARALLEL PROCESSING, PART II, EURO-PAR 2024, 2024, 14802 : 182 - 195
  • [47] A Methodology For Performance Analysis of Non-Blocking Algorithms Using Hardware and Software Metrics
    Izadpanah, Ramin
    Feldman, Steven
    Dechev, Damian
    2016 IEEE 19TH INTERNATIONAL SYMPOSIUM ON REAL-TIME DISTRIBUTED COMPUTING (ISORC 2016), 2016, : 43 - 52
  • [48] High-Performance Compact HF Antenna for Radar and Communication Applications
    Baker, James
    Iskander, Magdy F.
    Youn, Hyoung-Sun
    Celik, Nuri
    2010 IEEE ANTENNAS AND PROPAGATION SOCIETY INTERNATIONAL SYMPOSIUM, 2010,
  • [49] Optimized memory-based messaging: Leveraging the memory system for high-performance communication
    Cheriton, DR
    Kutter, RA
    COMPUTING SYSTEMS, 1996, 9 (03): : 179 - 215
  • [50] Designing Non-blocking Allreduce with Collective Offload on InfiniBand Clusters: A Case Study with Conjugate Gradient Solvers
    Kandalla, K.
    Yang, U.
    Keasler, J.
    Kolev, T.
    Moody, A.
    Subramoni, H.
    Tomko, K.
    Vienne, J.
    de Supinski, B. R.
    Panda, D. K.
    2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2012, : 1156 - 1167