IMR: High-Performance Low-Cost Multi-Ring NoCs

Cited by: 11
Authors
Liu, Shaoli [1]
Chen, Tianshi [1,2]
Li, Ling [3]
Feng, Xiaoxue [1]
Xu, Zhiwei [1]
Chen, Haibo [4]
Chong, Fred [5]
Chen, Yunji [1,2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
[2] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[4] Shanghai Jiao Tong Univ, Inst Parallel & Distributed Syst, Shanghai 200240, Peoples R China
[5] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93117 USA
Keywords
Network on Chip; Topology; Multi-Ring; NETWORK; RING; ROUTER; SINGLE
DOI
10.1109/TPDS.2015.2465905
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
A ring topology is a common choice for networks-on-chip (NoCs) in industry, but it is frequently criticized for poor scalability. In this paper, we present a novel type of multi-ring NoC called isolated multi-ring (IMR), which can support chip multi-processors (CMPs) with as many as 1,024 cores. In IMR, every pair of cores is connected via at least one isolated ring, so that each packet can reach its destination without transferring from one ring to another. IMR therefore does not need the expensive routers that a mesh requires, which both enhances network performance and reduces hardware overhead. We use simulated evolution to design optimized IMR topologies. We compare these IMR topologies against nine representative NoCs (e.g., traditional mesh, multi mesh, low-cost mesh, Express-virtual-channels mesh (EVC), torus ring, and hierarchical ring). Our experiments show that IMR significantly outperforms its competitors in both saturation throughput and latency across all scenarios considered. For example, in a 16 x 16 CMP, IMR improves the saturation throughput of a state-of-the-art mesh (EVC) by 265.29 percent on average and reduces the average packet latency on SPLASH-2 application traces by 71.58 percent, while consuming 5.08 percent less area and 9.76 percent less power. In a 32 x 32 CMP, IMR improves the saturation throughput of EVC by 191.58 percent on average and reduces the packet latency on SPLASH-2 application traces by 23.09 percent on average, while consuming 2.86 percent less area and 10.81 percent less power.
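The connectivity property stated in the abstract (every pair of cores shares at least one ring, so no packet has to change rings) can be illustrated with a small check. This is only a sketch and not the authors' tooling: the function names `uncovered_pairs` and `is_isolated_multi_ring`, the representation of a ring as a list of core IDs, and the toy 4-core topologies are all assumptions made for illustration. It does not model the simulated-evolution search or any cost metric from the paper.

```python
from itertools import combinations


def uncovered_pairs(num_cores, rings):
    """Return the core pairs that do not share any ring.

    `rings` is a list of rings, each given as a list of core IDs
    (hypothetical representation, assumed for this sketch).
    """
    covered = set()
    for ring in rings:
        # Every two cores on the same ring can reach each other
        # without leaving that ring.
        for pair in combinations(sorted(set(ring)), 2):
            covered.add(pair)
    all_pairs = set(combinations(range(num_cores), 2))
    return sorted(all_pairs - covered)


def is_isolated_multi_ring(num_cores, rings):
    """True if every pair of cores shares at least one ring."""
    return not uncovered_pairs(num_cores, rings)


# Toy example with 4 cores: two rings leave the pair (0, 3) uncovered.
partial = [[0, 1, 2], [1, 2, 3]]
print(uncovered_pairs(4, partial))

# Adding a ring through cores 0 and 3 covers the missing pair.
complete = partial + [[0, 3]]
print(is_isolated_multi_ring(4, complete))
```

A topology search such as the simulated evolution mentioned in the abstract would presumably need to preserve this property for every candidate it keeps; how the paper enforces it is not stated in this record.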
Pages: 1700-1712
Page count: 13