Unified reliability estimation and management of NoC based chip multiprocessors

Cited by: 20
Authors
Yamamoto, Alexandre Yasuo [1]
Ababei, Cristinel [2]
Affiliations
[1] N Dakota State Univ, Dept Elect & Comp Engn, Fargo, ND 58105 USA
[2] SUNY Buffalo, Dept Elect Engn, Buffalo, NY 14260 USA
Funding
National Science Foundation (USA)
Keywords
Reliability; Mean time to failure; Network-on-chip; Chip multiprocessor; Thermal management
DOI
10.1016/j.micpro.2013.11.009
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We present a new architecture-level unified reliability evaluation methodology for chip multiprocessors (CMPs). The proposed reliability estimation (REST) tool is based on a Monte Carlo algorithm. What distinguishes REST from previous work is that both the computational and communication components are considered in a unified manner to compute the reliability of the CMP. We use the REST tool to develop a new dynamic reliability management (DRM) scheme to address time-dependent dielectric breakdown and negative-bias temperature instability aging mechanisms in network-on-chip (NoC) based CMPs. Designed as a control loop, the proposed DRM scheme uses an effective neural-network-based reliability estimation module. The neural-network predictor is trained using the REST tool. We investigate how the system's lifetime changes when the NoC, as the communication unit of the CMP, is or is not considered during the reliability evaluation process, and find that differences can be as high as 60%. Full-system simulations using a customized GEM5 simulator show that reliability can be improved by up to 52% using the proposed DRM scheme in a best-effort scenario with a 2-9% performance penalty (using a user-set target lifetime of 7 years) over the case when no DRM is employed. (C) 2013 Elsevier B.V. All rights reserved.
Pages: 53-63
Number of pages: 11
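
To illustrate the idea in the abstract of estimating lifetime by Monte Carlo sampling, with and without the NoC included, here is a minimal sketch. It is not the REST tool. It assumes, purely for illustration, that each component's time to failure follows a Weibull distribution, that the chip fails as soon as its first component fails (a series system), and that the component counts and scale values are hypothetical placeholders rather than figures from the paper.

```python
import random

def sample_system_lifetime(scales, shape, rng):
    """Draw one system lifetime under a series-system assumption:
    the chip fails when its first component fails."""
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    return min(rng.weibullvariate(scale, shape) for scale in scales)

def estimate_mttf(scales, shape=2.0, trials=20000, seed=0):
    """Monte Carlo estimate of mean time to failure (MTTF), in the same
    time unit as the scale parameters."""
    rng = random.Random(seed)
    total = sum(sample_system_lifetime(scales, shape, rng) for _ in range(trials))
    return total / trials

# Hypothetical 4x4 mesh: 16 cores and 16 routers (assumed values, in years).
core_scales = [10.0] * 16
router_scales = [15.0] * 16

mttf_cores_only = estimate_mttf(core_scales)
mttf_unified = estimate_mttf(core_scales + router_scales)

print(f"MTTF, cores only:       {mttf_cores_only:.2f} years")
print(f"MTTF, cores and routers: {mttf_unified:.2f} years")
```

Under these assumptions, adding the routers can only shorten the estimated lifetime, which is the qualitative effect the abstract describes when the NoC is included in the evaluation. The size of the gap here depends entirely on the assumed parameters.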