Hybrid Memory Buffer Microarchitecture for High-Radix Routers

Authors
Li, Cunlu [1, 2]
Dong, Dezun [1, 2]
Liao, Xiangke [1, 2]
Kim, John [3]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Natl Lab Parallel & Distributed Proc, Changsha 410073, Peoples R China
[2] Natl Univ Def Technol, Coll Comp, Collaborat Innovat Ctr High Performance Comp, Changsha 410073, Peoples R China
[3] Korea Adv Inst Sci & Technol KAIST, Sch Elect Engn, Daejeon 34141, South Korea
Keywords
Random access memory; Switches; Organizations; Microarchitecture; Ports (computers); Magnetic tunneling; System-on-chip; Hierarchical router; STT-MRAM; high-radix router; ARCHITECTURE; PERFORMANCE; INTERCONNECT; ENERGY; CACHE
DOI
10.1109/TC.2021.3076431
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Hierarchical high-radix router microarchitectures built from small SRAM-based intermediate buffers have been used in the interconnection networks of large-scale supercomputers. While the hierarchical organization enables efficient scaling to higher switch port counts, it requires intermediate buffers, which can become a performance bottleneck. Shallow intermediate buffers can cause head-of-line blocking, which creates backpressure towards the input buffers and reduces overall performance. Increasing the intermediate buffer size overcomes this problem but becomes infeasible due to the large overhead. In this work, we propose to organize the decentralized intermediate buffers as a centralized buffer and to leverage an alternative memory technology to increase its capacity. In particular, we exploit the high density of Spin-Transfer Torque Magnetic RAM (STT-MRAM) to increase intermediate buffer depth while also providing near-zero leakage power. STT-MRAM has disadvantages, such as higher write latency and higher write energy. To overcome these disadvantages, we propose DeepHiR, a novel deep hybrid buffer organization (STT-MRAM and SRAM) combined with a centralized buffer organization, to provide high performance at minimal cost. Although the deep intermediate buffer provided by DeepHiR effectively improves router performance, the large amount of input buffering still incurs substantial hardware overhead. At the same time, deeper intermediate buffers also make backpressure take longer to propagate to the source node, thereby reducing the performance of DeepHiR. Therefore, we further propose ElasHiR, which leverages an elastic input buffer design in the centralized row buffer to allow part of the centralized row buffer to act as an input buffer. ElasHiR adopts reduced input buffers and automatically determines the length of the input buffer in the centralized row buffer. This design minimizes buffer resources while achieving excellent efficiency.
Evaluation results show that DeepHiR achieves a 56.7 percent improvement in packet latency under synthetic traffic, at a moderate cost in energy and area. ElasHiR can reduce the input buffer by 93.8 percent with performance comparable to DeepHiR.
Pages: 2888-2902
Page count: 15