Performance-Aware Reliability Assessment of Heterogeneous Chips

Cited by: 0
Authors
Chatzidimitriou, Athanasios [1]
Kaliorakis, Manolis [1]
Tselonis, Sotiris [1]
Gizopoulos, Dimitris [1]
Affiliations
[1] Univ Athens, Dept Informat & Telecommun, Athens, Greece
Keywords
vulnerability evaluation; reliability; performance; fault injection; microarchitectural simulators; CPU; GPU
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Technology evolution has raised serious reliability concerns: as transistor dimensions shrink, modern microprocessors become denser and more vulnerable to faults. Reliability studies have proposed a plethora of methodologies for assessing system vulnerability; these, however, rely heavily on traditional reliability metrics that express only the failure rate over time. Although Failures In Time (FIT) is a very strong and representative reliability metric, it may fail to offer an objective comparison of highly diverse systems, such as CPUs against GPUs or other accelerators, which are often employed to execute the same algorithms implemented for each platform. In this paper, we propose a reliability evaluation methodology that takes into account the probability of a workload execution failure in order to compare heterogeneous systems, while also capturing the differences in the performance of these systems. We demonstrate the usefulness of the methodology with a test case scenario that compares the reliability and performance of three different commercial CPUs (different ISAs and microarchitectures) and one GPU. We use statistical fault injection to assess the vulnerability of the register file for the four computing systems of our study. The evaluation was performed using a comprehensive set of benchmarks with the same algorithms implemented for each individual system (serial code for the CPUs and parallel code for the GPU). Our findings show that, even though the GPU proves to be three orders of magnitude more vulnerable than the CPUs under traditional reliability metrics, our performance-aware evaluation methodology shrinks this gap by 1-2 orders of magnitude, providing more informative and realistic measurements to guide designers' or programmers' decisions.
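The contrast the abstract draws between a failure rate over time and a per-execution failure probability can be illustrated with a minimal sketch. It is not the paper's metric, whose exact definition the abstract does not give. It assumes a constant failure rate (exponential model) and uses the standard definition of one FIT as one failure per 10^9 device-hours. The function name and all FIT and runtime values below are hypothetical and chosen only for illustration.

```python
import math

# Standard definition: 1 FIT = 1 failure per 10^9 device-hours.
HOURS_PER_FIT_UNIT = 1e9
SECONDS_PER_HOUR = 3600.0


def failure_probability_per_run(fit: float, runtime_seconds: float) -> float:
    """Probability of at least one failure during a single run.

    Assumes a constant failure rate (exponential model); this is an
    illustrative assumption, not the methodology of the paper.
    """
    rate_per_second = fit / (HOURS_PER_FIT_UNIT * SECONDS_PER_HOUR)
    # 1 - exp(-rate * t), computed with expm1 for accuracy at tiny values.
    return -math.expm1(-rate_per_second * runtime_seconds)


# Hypothetical inputs: a device with 1000x the FIT that finishes the
# same workload 100x faster.
cpu_fit, cpu_runtime = 1.0, 100.0
gpu_fit, gpu_runtime = 1000.0, 1.0

fit_gap = gpu_fit / cpu_fit
per_run_gap = (
    failure_probability_per_run(gpu_fit, gpu_runtime)
    / failure_probability_per_run(cpu_fit, cpu_runtime)
)

print(f"gap in FIT: {fit_gap:.0f}x")
print(f"gap in per-run failure probability: {per_run_gap:.1f}x")
```

With these made-up inputs, the shorter runtime offsets part of the higher failure rate, so the per-run gap is smaller than the FIT gap. The actual figures for the four systems studied come from the paper's fault-injection campaigns and are not reproduced here.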
Pages: 6
Related papers
50 in total
  • [31] Performance-Aware Based Correlated Datasets Replication Strategy
    Ye, Lin
    Luan, Zhongzhi
    Yang, Hailong
    [J]. TRUSTWORTHY COMPUTING AND SERVICES (ISCTCS 2014), 2015, 520 : 322 - 327
  • [32] A Network Performance-Aware Routing for Multisite Virtual Clusters
    Ichikawa, Kohei
    Date, Susumu
    Abe, Hirotake
    Yamanaka, Hiroaki
    Kawai, Eiji
    Shimojo, Shinji
    [J]. 2013 19TH IEEE INTERNATIONAL CONFERENCE ON NETWORKS (ICON), 2013
  • [33] An Efficient and Performance-Aware Big Data Storage System
    Li, Yang
    Guo, Li
    Guo, Yike
    [J]. CLOUD COMPUTING AND SERVICES SCIENCE, CLOSER 2012, 2013, 367 : 102 - 116
  • [34] Performance-Aware Thermal Management via Task Scheduling
    Zhou, Xiuyi
    Yang, Jun
    Chrobak, Marek
    Zhang, Youtao
    [J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2010, 7 (01)
  • [35] Method of network slicing deployment based on performance-aware
    Huang, Kaizhi
    Pan, Qirun
    Yuan, Quan
    You, Wei
    Tang, Hongbo
    [J]. Tongxin Xuebao/Journal on Communications, 2019, 40 (08) : 114 - 122
  • [36] Towards Performance-Aware Engineering of Autonomic Component Ensembles
    Bures, Tomas
    Horky, Vojtech
    Kit, Michal
    Marek, Lukas
    Tuma, Petr
    [J]. LEVERAGING APPLICATIONS OF FORMAL METHODS, VERIFICATION AND VALIDATION: TECHNOLOGIES FOR MASTERING CHANGE, PT I, 2014, 8802 : 131 - 146
  • [37] SPO: A Secure and Performance-aware Optimization for MapReduce Scheduling
    Maleki, Neda
    Rahmani, Amir Masoud
    Conti, Mauro
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2021, 176
  • [38] Performance-aware Security of Unicast Communication in Hybrid Satellite Networks
    Roy-Chowdhury, Ayan
    Baras, John S.
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, VOLS 1-8, 2009 : 797 - 802
  • [39] Performance-Aware Approximation of Global Channel Pruning for Multitask CNNs
    Ye, Hancheng
    Zhang, Bo
    Chen, Tao
    Fan, Jiayuan
    Wang, Bin
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (08) : 10267 - 10284
  • [40] Performance-Aware Big Data Management for Remote Sensing Systems
    Pekturk, Mustafa Kemal
    Unal, Muhammet
    Gokcen, Hadi
    [J]. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024, 49 (03) : 3845 - 3865