Ensuring Deadlock-Freedom in Low-Diameter InfiniBand Networks

Cited by: 0
Authors
Schneider, Timo [1]
Bibartiu, Otto [1]
Hoefler, Torsten [1]
Affiliations
[1] Swiss Fed Inst Technol, Dept Comp Sci, Zurich, Switzerland
Keywords
TABLES;
DOI
10.1109/HOTI.2016.11
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Lossless networks, such as InfiniBand, use flow control to avoid packet loss due to congestion. This introduces dependencies between input and output channels; if these dependencies are cyclic, the network can deadlock. Deadlocks can be resolved by splitting a physical channel into multiple virtual channels with independent buffers and credit systems. Currently available routing engines for InfiniBand assign entire paths from source to destination nodes to different virtual channels. However, InfiniBand allows changing the virtual channel at every switch. We developed fast routing engines that make use of this fact and map individual hops to virtual channels. Our algorithm imposes a total order on virtual channels and increments the virtual channel at every hop; thus, the diameter of the network is an upper bound on the required number of virtual channels. We integrated this algorithm into the InfiniBand software stack. Our algorithms provide deadlock-free routing on state-of-the-art low-diameter topologies, using fewer virtual channels than currently available practical approaches, while being faster by a factor of four on large networks. Since low-diameter topologies are common among the largest supercomputers in the world, providing deadlock-free routing for such systems is very important.
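The per-hop idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' routing engine and does not touch the InfiniBand software stack. It assumes a connected switch graph given as an adjacency dictionary, uses plain BFS shortest paths as a stand-in for the actual routing, and the names `shortest_path`, `assign_vcs`, `channel_dependencies`, and `is_acyclic` are hypothetical helpers introduced here. Each hop is placed on the virtual channel equal to its hop index, so every dependency leads from a lower to a higher virtual channel and no cycle can form.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Return one shortest path from src to dst via BFS (graph assumed connected)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def assign_vcs(path):
    """Map each hop (u, v) of a path to a virtual channel equal to its hop index."""
    return [((u, v), hop) for hop, (u, v) in enumerate(zip(path, path[1:]))]


def channel_dependencies(adj):
    """Collect dependencies between (link, vc) pairs over all source/destination pairs."""
    deps = set()
    highest_vc = -1
    for src in adj:
        for dst in adj:
            if src == dst:
                continue
            hops = assign_vcs(shortest_path(adj, src, dst))
            # A packet holding hop k waits for buffer space on hop k + 1.
            deps.update(zip(hops, hops[1:]))
            highest_vc = max(highest_vc, hops[-1][1])
    return deps, highest_vc + 1


def is_acyclic(deps):
    """Check the dependency graph for cycles with Kahn's topological sort."""
    nodes = {n for edge in deps for n in edge}
    indegree = {n: 0 for n in nodes}
    succ = {n: [] for n in nodes}
    for a, b in deps:
        succ[a].append(b)
        indegree[b] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    seen = 0
    while ready:
        n = ready.pop()
        seen += 1
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    return seen == len(nodes)


# Hypothetical example topology: a ring of four switches (diameter 2).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
deps, num_vcs = channel_dependencies(ring)
print(num_vcs, is_acyclic(deps))
```

On this ring the longest shortest path has two hops, so the sketch needs two virtual channels, which matches the diameter bound stated in the abstract.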
Pages: 1-8
Number of pages: 8
Related papers
50 in total
  • [31] Ant Mill: an adversarial traffic pattern for low-diameter direct networks
    Camarero, Cristobal
    Martinez, Carmen
    Beivide, Ramon
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (12): : 18062 - 18080
  • [32] Deadlock-Free Layered Routing for Infiniband Networks
    Kawano, Ryuta
    Matsutani, Hiroki
    Amano, Hideharu
    2019 SEVENTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING WORKSHOPS (CANDARW 2019), 2019, : 84 - 90
  • [33] On straightening low-diameter unit trees
    Poon, SH
    GRAPH DRAWING, 2006, 3843 : 519 - 521
  • [34] Parsimonious formulations for low-diameter clusters
    Salemi, Hosseinali
    Buchanan, Austin
    MATHEMATICAL PROGRAMMING COMPUTATION, 2020, 12 (03) : 493 - 528
  • [35] A Dynamic Sufficient Condition of Deadlock-Freedom for High-Performance Fault-Tolerant Routing in Networks-on-Chips
    Charif, Amir
    Coelho, Alexandre
    Zergainoh, Nacer-Eddine
    Nicolaidis, Michael
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2020, 8 (03) : 642 - 654
  • [36] LOW-DIAMETER GRAPH DECOMPOSITION IS IN NC
    AWERBUCH, B
    BERGER, B
    COWEN, L
    PELEG, D
    RANDOM STRUCTURES & ALGORITHMS, 1994, 5 (03) : 441 - 452
  • [37] A Simple Deterministic Distributed Low-Diameter Clustering
    Rozhoň, Václav
    Haeupler, Bernhard
    Grunau, Christoph
    arXiv, 2022,
  • [38] A Simple Deterministic Distributed Low-Diameter Clustering
    Rozhon, Vaclav
    Haeupler, Bernhard
    Grunau, Christoph
    2023 SYMPOSIUM ON SIMPLICITY IN ALGORITHMS, SOSA, 2023, : 166 - 174
  • [39] Designing Energy-Efficient Low-Diameter On-chip Networks with Equalized Interconnects
    Joshi, Ajay
    Kim, Byungsub
    Stojanovic, Vladimir
    2009 17TH IEEE SYMPOSIUM ON HIGH-PERFORMANCE INTERCONNECTS (HOTI 2009), 2009, : 3 - 12
  • [40] A polynomial-time checkable sufficient condition for deadlock-freedom of component-based systems
    Majster-Cederbaum, Mila
    Martens, Moritz
    Minnameier, Christoph
    SOFSEM 2007: THEORY AND PRACTICE OF COMPUTER SCIENCE, PROCEEDINGS, 2007, 4362 : 888 - +