Monitoring InfiniBand Networks to React Efficiently to Congestion

Cited by: 3
Authors
Cascajo, Alberto [1]
Gomez-Lopez, Gabriel [2]
Escudero-Sahuquillo, Jesus [3]
Garcia, Pedro Javier [3]
Singh, David E. [1]
Alfaro-Cortes, Francisco [3]
Quiles, Francisco J. [4]
Carretero, Jesus [1]
Affiliations
[1] Univ Carlos III Madrid, Madrid 28903, Spain
[2] Univ Castilla La Mancha, Adv Informat Technol, Ciudad Real 13001, Spain
[3] Univ Castilla La Mancha UCLM, Comp Architecture & Technol, Ciudad Real 13001, Spain
[4] Univ Castilla La Mancha UCLM, Comp Syst Dept, Ciudad Real 13001, Spain
Funding
European Union Horizon 2020;
Keywords
Monitoring; Indexes; Routing; Proposals; Performance evaluation; Network topology; Software; Interconnection networks; cluster; congestion control; traffic monitoring; InfiniBand;
DOI
10.1109/MM.2023.3241840
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Current high-performance interconnection networks for high-performance computing and data-center systems incorporate mechanisms to prevent congestion from degrading network performance. Specifically, the popular InfiniBand specification defines a mechanism to reduce the injection rate of the traffic flows contributing to congestion. However, the efficiency of this mechanism depends on the values configured for certain parameters, which may be suitable for some congestion situations but not for others. Therefore, we think that these parameters should be reconfigured dynamically, based on accurate and updated information about the actual status of congestion. For that purpose, we have combined a light-weight platform monitoring tool (LIMITLESS) with the InfiniBand control software (OpenSM), so that the former provides the latter with enhanced knowledge about congestion to appropriately reconfigure the parameters driving the behavior of the congestion-control mechanism. Experiments performed in a real InfiniBand-based cluster confirm that this approach significantly reduces the number of wrong reactions to the congestion-control mechanism.
Pages: 120-130
Number of pages: 11
Related papers
  • [1] Congestion control in InfiniBand networks
    Gusat, M
    Craddock, D
    Denzel, W
    Engbersen, T
    Ni, N
    Pfister, G
    Rooney, W
    Duato, J
    [J]. HOT INTERCONNECTS 13, 2005, : 158 - 159
  • [2] Improving Congestion Control through Fine-Grain Monitoring of InfiniBand Networks
    Cascajo, Alberto
    Gomez-Lopez, Gabriel
    Escudero-Sahuquillo, Jesus
    Javier Garcia, Pedro
    Singh, David E.
    Alfaro-Cortes, Francisco
    Quiles, Francisco J.
    Carretero, Jesus
    [J]. 2022 IEEE SYMPOSIUM ON HIGH-PERFORMANCE INTERCONNECTS (HOTI), 2022, : 29 - 38
  • [3] Adaptive Multipath Routing for Congestion Control in InfiniBand Networks
    Lugones, D.
    Franco, D.
    Luque, E.
    [J]. 2009 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPPW 2009), 2009, : 222 - 227
  • [4] LDMS Monitoring of EDR InfiniBand Networks
    Allan, Benjamin A.
    Aguilar, Michael
    Schwaller, Benjamin
    Langer, Steven
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER 2020), 2020, : 459 - 463
  • [5] Feasible enhancements to congestion control in InfiniBand-based networks
    Escudero-Sahuquillo, Jesus
    Garcia, Pedro J.
    Quiles, Francisco J.
    Maglione-Mathey, German
    Duato, Jose
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2018, 112 : 35 - 52
  • [6] The Dynamic Nature of Congestion in InfiniBand
    Liu, Qian
    Russell, Robert D.
    Mizero, Fabrice
    Veeraraghavan, Malathi
    Dennis, John M.
    Jamroz, Benjamin
    [J]. 2015 INTERNATIONAL CONFERENCE AND WORKSHOP ON COMPUTING AND COMMUNICATION (IEMCON), 2015,
  • [7] How do airlines react to airport congestion? The role of networks
    Fageda, Xavier
    Flores-Fillol, Ricardo
    [J]. REGIONAL SCIENCE AND URBAN ECONOMICS, 2016, 56 : 73 - 81
  • [8] Improvements to the InfiniBand Congestion Control Mechanism
    Liu, Qian
    Russell, Robert D.
    Gran, Ernst Gunnar
    [J]. 2016 IEEE 24TH ANNUAL SYMPOSIUM ON HIGH-PERFORMANCE INTERCONNECTS (HOTI), 2016, : 27 - 36
  • [9] An enhanced congestion control mechanism in InfiniBand networks for High Performance Computing systems
    Yan, Shihang
    Min, Geyong
    Awan, Irfan
    [J]. 20TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOL 1, PROCEEDINGS, 2006, : 845 - +
  • [10] A Measurement Study of Congestion in an InfiniBand Network
    Alali, Fatma
    Mizero, Fabrice
    Veeraraghavan, Malathi
    Dennis, John M.
    [J]. TMA CONFERENCE 2017 - PROCEEDINGS OF THE 1ST NETWORK TRAFFIC MEASUREMENT AND ANALYSIS CONFERENCE, 2017,