Exploring Winograd Convolution for Cost-Effective Neural Network Fault Tolerance

Times cited: 2
Authors
Xue, Xinghua [1, 2]
Liu, Cheng [1, 2]
Liu, Bo [3]
Huang, Haitong [1, 2]
Wang, Ying [1, 2]
Luo, Tao [4]
Zhang, Lei [1, 2]
Li, Huawei [1, 2]
Li, Xiaowei [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Processors, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China
[3] Beijing Inst Control Engn, Beijing 100190, Peoples R China
[4] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Funding
National Natural Science Foundation of China;
Keywords
Fault tolerant systems; Fault tolerance; Artificial neural networks; Convolution; Reliability; Computational modeling; Neurons; Fault-tolerance; soft errors; vulnerability analysis; Winograd convolution (WG-Conv);
DOI
10.1109/TVLSI.2023.3306894
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Winograd convolution is widely used to improve convolution performance and computational efficiency because it reduces the number of multiplication operations, but its effect on reliability is usually overlooked. In this work, we observe that Winograd convolution (WG-Conv) has great potential for improving neural network (NN) fault tolerance. Based on this observation, we present the first comprehensive evaluation of WG-Conv fault tolerance at different granularities, ranging from models and layers to operation types. We then explore how the inherent fault tolerance of WG-Conv can be used for cost-effective NN protection against soft errors. Specifically, we investigate how WG-Conv can be effectively combined with classical fault-tolerant design approaches, including triple modular redundancy (TMR), fault-aware retraining, and constrained activation functions. According to our experiments, WG-Conv reduces the fault-tolerant design overhead by 55.77% on average without any accuracy loss compared to standard convolution (ST-Conv), and further reduces the computing overhead by 17.24% when the inherent fault tolerance of WG-Conv is considered. When WG-Conv is applied to fault-tolerant NNs enhanced with fault-aware retraining and constrained activation functions, the resulting model accuracy generally improves significantly in the presence of various faults.
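The multiplication saving that the abstract refers to can be illustrated with the smallest textbook case, the 1-D Winograd transform F(2,3), which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. The sketch below is illustrative only and is not taken from the paper; the function names `direct_conv_f23` and `winograd_f23` are hypothetical, and the layer shapes, data types, and tile sizes used by the authors are not stated in this record.

```python
def direct_conv_f23(d, g):
    """Standard sliding-window convolution (ST-Conv style).

    d: 4 input samples, g: 3 filter taps.
    Produces 2 outputs using 2 x 3 = 6 multiplications.
    """
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same 2 outputs with 4 multiplications.

    The filter transform depends only on g, so in a trained network
    it can be computed once and reused for every input tile.
    """
    # Filter transform (precomputable per filter).
    u = [
        g[0],
        (g[0] + g[1] + g[2]) / 2,
        (g[0] - g[1] + g[2]) / 2,
        g[2],
    ]
    # Input transform: additions and subtractions only.
    v = [
        d[0] - d[2],
        d[1] + d[2],
        d[2] - d[1],
        d[1] - d[3],
    ]
    # Element-wise product: the only 4 data-dependent multiplications.
    m = [v[i] * u[i] for i in range(4)]
    # Output transform: additions and subtractions only.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [1.0, 0.0, -1.0]
    print(direct_conv_f23(d, g))
    print(winograd_f23(d, g))
```

Both functions return the same two values for any input; the Winograd version trades two multiplications for extra additions, which is the kind of change in operation mix that the abstract says is evaluated at the operation-type granularity.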
Pages: 1763-1773
Page count: 11