Efficient Software-Implemented HW Fault Tolerance for TinyML Inference in Safety-critical Applications

Cited by: 3
Authors
Sharif, Uzair [1]
Mueller-Gritschneder, Daniel [1]
Stahl, Rafael [1]
Schlichtmann, Ulf [1]
Affiliations
[1] Technical University of Munich (TUM), Chair of Electronic Design Automation, Munich, Germany
Keywords
TinyML; safety; error detection; soft-error
DOI
10.23919/DATE56975.2023.10137207
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
TinyML research has mainly focused on optimizing neural network inference in terms of latency, code size and energy use for efficient execution on low-power microcontroller units (MCUs). However, distinctive design challenges emerge in safety-critical applications, for example in small unmanned autonomous vehicles such as drones, because off-the-shelf MCU devices are susceptible to soft errors. We propose three new techniques to protect TinyML inference against random soft errors with the aim of reducing run-time overhead: one for protecting fully-connected layers; one adaptation of existing algorithmic fault tolerance techniques to depth-wise convolutions; and an efficient technique to protect the so-called epilogues within TinyML layers. Integrating these layer-wise methods, we derive a full-inference hardening solution for TinyML that achieves run-time-efficient soft-error resilience. We evaluate our proposed solution on the MLPerf-Tiny benchmarks. Our experimental results show that resilience competitive with currently available methods can be achieved while reducing run-time overheads by ~120% for the one fully-connected neural network (NN), by ~20% for the two CNNs with depth-wise convolutions, and by ~2% for the standard CNN. Additionally, we propose selective hardening, which reduces the incurred run-time overhead by a further ~2x for the studied CNNs by focusing exclusively on avoiding mispredictions.
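As background for the "existing algorithmic fault tolerance techniques" the abstract refers to, the following is a minimal sketch of the classical checksum idea applied to an integer fully-connected layer. It is not the technique proposed in the paper, whose details the abstract does not give. All function names, the toy weights, and the single-bit-flip fault model are assumptions made for illustration only.

```python
# Sketch of classical checksum-based error detection for y = W x.
# Assumption: integer (quantized) arithmetic, so the check can be exact.
# All names below are hypothetical and not taken from the paper.

def fc_layer(weights, x):
    """Integer matrix-vector product y = W x (no bias, no activation)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def column_checksum(weights):
    """Offline step: c[j] = sum over rows i of W[i][j]."""
    return [sum(col) for col in zip(*weights)]

def check_output(y, checksum, x):
    """Return True when sum(y) equals the checksum prediction c . x."""
    expected = sum(c * v for c, v in zip(checksum, x))
    return sum(y) == expected

def flip_bit(value, bit):
    """Model a single-bit soft error in one output (assumed fault model)."""
    return value ^ (1 << bit)

# Toy 3x3 layer and input, chosen arbitrarily for the example.
W = [[1, -2, 3], [4, 0, -1], [-3, 2, 2]]
x = [5, -1, 2]
c = column_checksum(W)          # computed once, before deployment

y = fc_layer(W, x)
clean_ok = check_output(y, c, x)

# Inject a fault into one output element and re-run the check.
y_faulty = list(y)
y_faulty[1] = flip_bit(y_faulty[1], 4)
faulty_ok = check_output(y_faulty, c, x)

print(y, clean_ok, faulty_ok)
```

The check costs one extra dot product per layer instead of recomputing the whole layer, which is why checksum schemes are attractive when run-time overhead matters. A single corrupted output element always changes the sum and is detected; multiple errors that cancel each other out would not be.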
Pages: 6