SparseFT: Sparsity-aware Fault Tolerance for Reliable CNN Inference on GPUs

Cited by: 0
Authors
Byeon, Gwangeun [1 ]
Lee, Seungtae [2 ]
Kim, Seongwook [1 ]
Kim, Yongjun [3 ]
Nair, Prashant J. [4 ]
Hong, Seokin [1 ]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Seoul, South Korea
[2] Sungkyunkwan Univ, Dept AI Syst Engn, Seoul, South Korea
[3] Samsung Elect, Seoul, South Korea
[4] Univ British Columbia, Vancouver, BC, Canada
Funding
National Research Foundation of Singapore;
Keywords
Reliability; CNN; GPU; Sparsity;
DOI
10.1109/PACT58117.2023.00041
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline code
0812;
Abstract
Graphics Processing Units (GPUs), while offering exceptional performance for CNN inference, are susceptible to both transient and permanent hardware faults due to the integration of numerous processing elements and continued technology scaling. This paper proposes a novel, cost-effective fault mitigation technique, called Sparsity-aware Fault Tolerance (SparseFT), to ensure reliable CNN inference on GPUs. SparseFT leverages the inherent sparsity of activation maps to detect and correct errors in the processing elements without hardware redundancy. Exploiting a key property of dot products, namely that multiplications with zero operands are ineffectual, SparseFT dynamically duplicates an effectual computation (i.e., a multiplication with non-zero operands) onto the processing element originally assigned to an ineffectual one. It then compares the duplicated computation results to detect errors. Experimental results demonstrate that SparseFT achieves more than 97% error-detection coverage with less than 1% performance overhead on state-of-the-art CNN models.
Pages: 337-338
Number of pages: 2
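
As a rough illustration of the mechanism described in the abstract, the sketch below models a dot product spread across processing elements (PEs): multiplications whose activation operand is zero are ineffectual, so their PEs re-execute an effectual multiplication, and a mismatch between the duplicate results flags a faulty PE. This is a minimal sketch, not the authors' implementation; the function names, the bit-flip fault model, and the pairing of effectual with ineffectual slots are assumptions, and the correction step mentioned in the abstract is omitted.

```python
# Minimal sketch (not the paper's implementation) of sparsity-aware
# duplicate-and-compare: zero-operand multiplications are ineffectual, so the
# PEs assigned to them re-run an effectual multiplication for error detection.

def pe_multiply(a, w, faulty=False):
    """Model one PE performing a multiply; a faulty PE corrupts its result."""
    result = a * w
    if faulty:
        result ^= 1 << 3  # assumed fault model: inject a single bit flip
    return result

def dot_product_with_detection(acts, weights, faulty_pes=frozenset()):
    """Dot product across PEs with sparsity-aware duplication for detection."""
    n = len(acts)
    products = [pe_multiply(acts[i], weights[i], i in faulty_pes) for i in range(n)]

    effectual = [i for i in range(n) if acts[i] != 0]    # real work
    ineffectual = [i for i in range(n) if acts[i] == 0]  # zero operand: spare PE

    detected = False
    # Duplicate each effectual multiplication onto a spare (ineffectual) PE
    # and compare the two results; a mismatch exposes a faulty PE.
    for src, spare in zip(effectual, ineffectual):
        dup = pe_multiply(acts[src], weights[src], spare in faulty_pes)
        if dup != products[src]:
            detected = True

    # Ineffectual products are nominally zero, so only effectual ones are summed.
    return sum(products[i] for i in effectual), detected

# Example: ~50% activation sparsity, PE 2 is faulty.
acts    = [3, 0, 5, 0, 7, 0, 2, 0]
weights = [1, 4, 2, 6, 3, 5, 8, 9]
value, err = dot_product_with_detection(acts, weights, faulty_pes={2})
print(value, err)  # the duplicated computation on the spare PE flags the fault
```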
相关论文
共 26 条
  • [1] SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
    Li, Kaiwei
    Chen, Jianfei
    Chen, Wenguang
    Zhu, Jun
    [J]. ACM SIGPLAN NOTICES, 2017, 52 (04) : 497 - 509
  • [2] SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
    Li, Kaiwei
    Chen, Jianfei
    Chen, Wenguang
    Zhu, Jun
    [J]. TWENTY-SECOND INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXII), 2017, : 497 - 509
  • [3] SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
    Li, Kaiwei
    Chen, Jianfei
    Chen, Wenguang
    Zhu, Jun
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (09) : 2112 - 2124
  • [4] SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
    Li, Kaiwei
    Chen, Jianfei
    Chen, Wenguang
    Zhu, Jun
    [J]. OPERATING SYSTEMS REVIEW, 2017, 51 (02) : 497 - 509
  • [5] AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers
    Tuli S.
    Jha N.K.
    [J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023, 42 (11) : 4038 - 4051
  • [6] Exploiting Activation Sparsity for Fast CNN Inference on Mobile GPUs
    Oh, Chanyoung
    So, Junhyuk
    Kim, Sumin
    Yi, Youngmin
    [J]. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2021, 20 (05)
  • [7] A Sparsity-Aware Fault Diagnosis Framework Focusing on Accurate Isolation
    Xiu, Xianchao
    Miao, Zhonghua
    Liu, Wanquan
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2023, 19 (02) : 1356 - 1365
  • [8] SPRING: A Sparsity-Aware Reduced-Precision Monolithic 3D CNN Accelerator Architecture for Training and Inference
    Yu, Ye
    Jha, Niraj K.
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2022, 10 (01) : 237 - 249
  • [9] SAVE: Sparsity-Aware Vector Engine for Accelerating DNN Training and Inference on CPUs
    Gong, Zhangxiaowen
    Ji, Houxiang
    Fletcher, Christopher W.
    Hughes, Christopher J.
    Baghsorkhi, Sara
    Torrellas, Josep
    [J]. 2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 2020), 2020, : 796 - 810
  • [10] Sparsity-Aware Tight Frame Learning for Rotary Machine Fault Diagnosis
    Zhang, Han
    Chen, Xuefeng
    Du, Zhaohui
    Ma, Meng
    Zhang, Xiaoli
    [J]. 2016 IEEE INTERNATIONAL INSTRUMENTATION AND MEASUREMENT TECHNOLOGY CONFERENCE PROCEEDINGS, 2016, : 819 - 824