SparseFT: Sparsity-aware Fault Tolerance for Reliable CNN Inference on GPUs

Cited by: 0
Authors
Byeon, Gwangeun [1 ]
Lee, Seungtae [2 ]
Kim, Seongwook [1 ]
Kim, Yongjun [3 ]
Nair, Prashant J. [4 ]
Hong, Seokin [1 ]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Seoul, South Korea
[2] Sungkyunkwan Univ, Dept AI Syst Engn, Seoul, South Korea
[3] Samsung Elect, Seoul, South Korea
[4] Univ British Columbia, Vancouver, BC, Canada
Funding
National Research Foundation of Singapore;
Keywords
Reliability; CNN; GPU; Sparsity;
DOI
10.1109/PACT58117.2023.00041
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline code
0812;
Abstract
Graphics Processing Units (GPUs), while offering exceptional performance for CNN inference, are susceptible to both transient and permanent hardware faults due to the integration of numerous processing elements and continued technology scaling. This paper proposes a novel, cost-effective fault mitigation technique, called Sparsity-aware Fault Tolerance (SparseFT), to ensure reliable CNN inference on GPUs. SparseFT leverages the inherent sparsity of activation maps to detect and correct errors in the processing elements without hardware redundancy. Exploiting a key property of dot products, namely that multiplications with zero operands are ineffectual, SparseFT dynamically duplicates an effectual computation (i.e., a multiplication with non-zero operands) onto the processing element originally assigned to an ineffectual one. It then compares the duplicated computation results to detect errors. Experimental results demonstrate that SparseFT achieves more than 97% error-detection coverage with less than 1% performance overhead on state-of-the-art CNN models.
Pages: 337-338
Number of pages: 2
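
As a rough illustration of the mechanism described in the abstract, the sketch below models a dot product spread across processing elements (PEs): multiplications whose activation operand is zero are ineffectual, so their PEs re-execute an effectual multiplication, and a mismatch between the duplicate results flags a faulty PE. This is a minimal sketch, not the authors' implementation; the function names, the bit-flip fault model, and the pairing of effectual with ineffectual slots are assumptions, and the correction step mentioned in the abstract is omitted.

```python
# Minimal sketch (not the paper's implementation) of sparsity-aware
# duplicate-and-compare: zero-operand multiplications are ineffectual, so the
# PEs assigned to them re-run an effectual multiplication for error detection.

def pe_multiply(a, w, faulty=False):
    """Model one PE performing a multiply; a faulty PE corrupts its result."""
    result = a * w
    if faulty:
        result ^= 1 << 3  # assumed fault model: inject a single bit flip
    return result

def dot_product_with_detection(acts, weights, faulty_pes=frozenset()):
    """Dot product across PEs with sparsity-aware duplication for detection."""
    n = len(acts)
    products = [pe_multiply(acts[i], weights[i], i in faulty_pes) for i in range(n)]

    effectual = [i for i in range(n) if acts[i] != 0]    # real work
    ineffectual = [i for i in range(n) if acts[i] == 0]  # zero operand: spare PE

    detected = False
    # Duplicate each effectual multiplication onto a spare (ineffectual) PE
    # and compare the two results; a mismatch exposes a faulty PE.
    for src, spare in zip(effectual, ineffectual):
        dup = pe_multiply(acts[src], weights[src], spare in faulty_pes)
        if dup != products[src]:
            detected = True

    # Ineffectual products are nominally zero, so only effectual ones are summed.
    return sum(products[i] for i in effectual), detected

# Example: ~50% activation sparsity, PE 2 is faulty.
acts    = [3, 0, 5, 0, 7, 0, 2, 0]
weights = [1, 4, 2, 6, 3, 5, 8, 9]
value, err = dot_product_with_detection(acts, weights, faulty_pes={2})
print(value, err)  # the duplicated computation on the spare PE flags the fault
```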
相关论文
共 26 条
  • [1] SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
    Li, Kaiwei
    Chen, Jianfei
    Chen, Wenguang
    Zhu, Jun
    [J]. ACM SIGPLAN NOTICES, 2017, 52 (04) : 497 - 509
  • [2] SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
    Li, Kaiwei
    Chen, Jianfei
    Chen, Wenguang
    Zhu, Jun
    [J]. TWENTY-SECOND INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXII), 2017, : 497 - 509
  • [3] SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
    Li, Kaiwei
    Chen, Jianfei
    Chen, Wenguang
    Zhu, Jun
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (09) : 2112 - 2124
  • [4] SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
    Li, Kaiwei
    Chen, Jianfei
    Chen, Wenguang
    Zhu, Jun
    [J]. OPERATING SYSTEMS REVIEW, 2017, 51 (02) : 497 - 509
  • [5] AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers
    Tuli S.
    Jha N.K.
    [J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023, 42 (11) : 4038 - 4051
  • [6] Exploiting Activation Sparsity for Fast CNN Inference on Mobile GPUs
    Oh, Chanyoung
    So, Junhyuk
    Kim, Sumin
    Yi, Youngmin
    [J]. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2021, 20 (05)
  • [7] A Sparsity-Aware Fault Diagnosis Framework Focusing on Accurate Isolation
    Xiu, Xianchao
    Miao, Zhonghua
    Liu, Wanquan
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2023, 19 (02) : 1356 - 1365
  • [8] SPRING: A Sparsity-Aware Reduced-Precision Monolithic 3D CNN Accelerator Architecture for Training and Inference
    Yu, Ye
    Jha, Niraj K.
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2022, 10 (01) : 237 - 249
  • [9] SAVE: Sparsity-Aware Vector Engine for Accelerating DNN Training and Inference on CPUs
    Gong, Zhangxiaowen
    Ji, Houxiang
    Fletcher, Christopher W.
    Hughes, Christopher J.
    Baghsorkhi, Sara
    Torrellas, Josep
    [J]. 2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 2020), 2020, : 796 - 810
  • [10] Sparsity-Aware Tight Frame Learning for Rotary Machine Fault Diagnosis
    Zhang, Han
    Chen, Xuefeng
    Du, Zhaohui
    Ma, Meng
    Zhang, Xiaoli
    [J]. 2016 IEEE INTERNATIONAL INSTRUMENTATION AND MEASUREMENT TECHNOLOGY CONFERENCE PROCEEDINGS, 2016, : 819 - 824