Addressing a New Class of Reliability Threats in 3-D Network-on-Chips
Cited by: 12
Authors:
Taheri, Ebadollah [1]
Isakov, Mihailo [1]
Patooghy, Ahmad [2]
Kinsy, Michel A. [1]
Affiliations:
[1] Boston Univ, Dept Elect & Comp Engn, Adapt & Secure Comp Syst Lab, Boston, MA 02215 USA
[2] Univ Cent Arkansas, Dept Comp Sci, Conway, AR 72035 USA
Source:
Keywords:
Three-dimensional displays;
Routing;
Reliability;
Circuit faults;
Computer network reliability;
Through-silicon vias;
Fabrication;
3-D integrated circuit;
network-on-chip (NoC);
reliability;
router architecture;
virtual channel;
FAULT-TOLERANT;
ROUTING ALGORITHM;
MESH;
DOI:
10.1109/TCAD.2019.2917846
Chinese Library Classification (CLC) number:
TP3 [Computing technology, computer technology];
Discipline classification code:
0812;
Abstract:
Network-on-chips (NoCs) are vulnerable to transient and permanent faults caused by thermal violations, aging effects, component wear-out, or even transient fault sources. Although some of these faults are addressed by previous research, we show that there are reliability threats in 3-D NoCs that go beyond the reliability issues investigated in 2-D interconnect networks. First, we highlight one such class of reliability threats and discuss their manifestations in 3-D NoCs. Second, we propose a thermal-, reliability-, and performance-aware routing algorithm to tackle: 1) previously established fault models and 2) the newly highlighted class of reliability threats in partially connected 3-D NoCs. The proposed routing algorithm takes into account the states of routers and of both the horizontal and through-silicon via (TSV) links, along with the temperatures of routers and cores. It then routes packets around failed or overheated links and routers, achieving lower latencies by avoiding misrouting. To achieve this, the proposed routing algorithm uses the concept of vertical link announcement to inform nodes in the network of the working condition of vertical links. We evaluate the proposed routing algorithm under a wide range of working conditions using the Access Noxim NoC simulator. Results show that the proposed routing algorithm: 1) is able to tolerate almost any number and pattern of vertical link failures; 2) is reliable against the newly identified reliability threats; and 3) improves the latency and temperature distribution of the network compared to previously proposed routing algorithms.
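Illustrative note: the abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of choosing a vertical link from announced link states in a partially connected 3-D mesh. It is not the authors' method. The names `VerticalLink` and `pick_vertical_link`, the 85 °C threshold, and the hop-count cost are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class VerticalLink:
    """Hypothetical announced state of one TSV link at planar position (x, y)."""
    x: int
    y: int
    working: bool
    temperature_c: float


def pick_vertical_link(
    src: Tuple[int, int],
    dst: Tuple[int, int],
    links: Iterable[VerticalLink],
    max_temp_c: float = 85.0,  # assumed threshold, not taken from the paper
) -> Optional[VerticalLink]:
    """Return the usable vertical link with the fewest horizontal hops.

    A link is usable if it is announced as working and is not hotter than
    max_temp_c. The cost is the Manhattan distance src -> link plus
    link -> dst, measured in the plane. Returns None if no link is usable.
    """
    usable = [l for l in links if l.working and l.temperature_c <= max_temp_c]
    if not usable:
        return None

    def hops(link: VerticalLink) -> int:
        to_link = abs(src[0] - link.x) + abs(src[1] - link.y)
        from_link = abs(dst[0] - link.x) + abs(dst[1] - link.y)
        return to_link + from_link

    return min(usable, key=hops)
```

In this sketch a failed or overheated TSV is simply excluded before the cost comparison, so a packet is never steered toward a vertical link that has been announced as unusable.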
Pages: 1358-1371
Page count: 14