Efficient Software-Implemented HW Fault Tolerance for TinyML Inference in Safety-critical Applications

Cited by: 3
Authors
Sharif, Uzair [1]
Mueller-Gritschneder, Daniel [1]
Stahl, Rafael [1]
Schlichtmann, Ulf [1]
Affiliations
[1] Technical University of Munich (TUM), Chair of Electronic Design Automation, Munich, Germany
Keywords
TinyML; safety; error detection; soft-error
DOI
10.23919/DATE56975.2023.10137207
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
TinyML research has mainly focused on optimizing neural network inference in terms of latency, code size and energy use for efficient execution on low-power micro-controller units (MCUs). However, distinctive design challenges emerge in safety-critical applications, for example in small unmanned autonomous vehicles such as drones, due to the susceptibility of off-the-shelf MCU devices to soft errors. We propose three new techniques to protect TinyML inference against random soft errors with the aim of reducing run-time overhead: one for protecting fully-connected layers; one adaptation of existing algorithmic fault tolerance techniques to depth-wise convolutions; and an efficient technique to protect the so-called epilogues within TinyML layers. Integrating these layer-wise methods, we derive a full-inference hardening solution for TinyML that achieves run-time efficient soft-error resilience. We evaluate our proposed solution on MLPerf-Tiny benchmarks. Our experimental results show that competitive resilience can be achieved compared with currently available methods, while reducing run-time overheads by approximately 120% for one fully-connected neural network (NN), approximately 20% for the two CNNs with depth-wise convolutions, and approximately 2% for a standard CNN. Additionally, we propose selective hardening, which reduces the incurred run-time overhead further by approximately 2x for the studied CNNs by focusing exclusively on avoiding mispredictions.
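As a point of orientation, the following is a minimal sketch of the classic checksum idea behind algorithmic fault tolerance, applied to an integer fully-connected layer. It is not the technique proposed in the paper, whose details the abstract does not give. The function names (`column_sums`, `fully_connected`, `checksum_ok`) and the layer dimensions are assumptions made for illustration only.

```python
import random

def column_sums(weights):
    # Per-column sums of W; these depend only on the weights,
    # so they could be computed once ahead of inference.
    return [sum(col) for col in zip(*weights)]

def fully_connected(weights, x):
    # Plain integer matrix-vector product y = W x (no bias, no activation).
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def checksum_ok(y, col_sums, x):
    # The sum of all outputs must equal (column sums of W) . x.
    # Integer arithmetic makes the comparison exact.
    expected = sum(c * v for c, v in zip(col_sums, x))
    return sum(y) == expected

# Hypothetical int8-range weights and inputs for a 4x8 layer.
random.seed(0)
W = [[random.randint(-128, 127) for _ in range(8)] for _ in range(4)]
x = [random.randint(-128, 127) for _ in range(8)]

col_sums = column_sums(W)
y = fully_connected(W, x)
assert checksum_ok(y, col_sums, x)            # fault-free run passes

y_faulty = list(y)
y_faulty[2] ^= 1 << 5                         # emulate a single bit flip in one output
assert not checksum_ok(y_faulty, col_sums, x) # the corrupted output is detected
```

The check costs one extra dot product per layer instead of a full re-execution, which is the general reason checksum-based schemes are attractive when run-time overhead matters. A fault that corrupts the input vector before both computations would go unnoticed by this sketch.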
Pages: 6