Exploiting Hardware Multicast and GPUDirect RDMA for Efficient Broadcast

Cited by: 9
Authors
Chu, Ching-Hsiang [1 ]
Lu, Xiaoyi [1 ]
Awan, Ammar A. [1 ]
Subramoni, Hari [1 ]
Elton, Bracy [2 ]
Panda, Dhabaleswar K. [1 ]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Engility Corp, Dayton, OH 45433 USA
Keywords
Broadcast; deep learning; hardware multicast; GPU; GPUDirect RDMA; heterogeneous broadcast; streaming
DOI
10.1109/TPDS.2018.2867222
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Broadcast is widely used in streaming and deep learning applications to disseminate large amounts of data on emerging heterogeneous High-Performance Computing (HPC) systems. However, traditional broadcast schemes do not fully utilize hardware features for Graphics Processing Unit (GPU)-based applications. This paper presents a model-oriented analysis that identifies the performance bottlenecks of existing broadcast schemes on GPU clusters. It then proposes streaming-based broadcast schemes that exploit InfiniBand hardware multicast (IB-MCAST) and NVIDIA GPUDirect technology for efficient message transmission. The proposed designs are evaluated with Message Passing Interface (MPI)-based benchmarks and applications. The experimental results indicate improved scalability and up to 82 percent lower latency than state-of-the-art solutions in the benchmark-level evaluation. Furthermore, compared to the state of the art, the proposed design yields stable, higher throughput for a synthetic streaming workload and 1.3x faster training for a deep learning framework.
Pages: 575-588
Page count: 14