An efficient cloud-based elastic RDMA protocol for HPC applications

Authors
Hang Cao
Cheng Xu
Yunqi Han
Muhui Lin
Kai Shen
Geng Wang
Jinhu Li
Xiangzheng Sun
Ronghui He
Liang You
Hang Yang
Xiantao Zhang
Affiliations
[1] Alibaba Group,
Keywords
RDMA; HPC applications; Cloud computing; Elastic networking
DOI
Not available
Abstract
High-performance computing (HPC) networking is critical to scaling HPC applications across multiple nodes. Most HPC applications deployed on traditional supercomputers or clusters adopt RDMA protocols such as InfiniBand for inter-node networking, to mitigate high latency during constant communication. As cloud-based HPC continues to emerge as a significant trend, utilizing RDMA in the cloud has become a challenging problem. To address this problem, we propose an efficient elastic RDMA protocol (eRDMA) that brings RDMA's merits to HPC applications in the cloud. eRDMA applies the direct data movement (DDM) of the cloud infrastructure processing unit (CIPU), the overlay of the virtual private cloud (VPC), and compatibility with RDMA verbs to fully utilize elastic resources together with the features of an RDMA network for HPC scenarios in the cloud. The effectiveness of eRDMA is demonstrated by experimental results across different platforms for many HPC and general TCP applications.
Pages: 45-53
Page count: 8
Source
CCF Transactions on High Performance Computing, 2024, 6 (01): 45-53