Detecting Silent Data Corruptions in Aerospace-Based Computing Using Program Invariants

Cited by: 5
Authors
Ma, Junchi [1, 2]
Yu, Dengyun [3]
Wang, Yun [1, 2]
Cai, Zhenbo [3]
Zhang, Qingxiang [3]
Hu, Cheng [1, 2]
Affiliations
[1] Southeast Univ, Sch Comp Sci & Engn, Nanjing 211189, Jiangsu, Peoples R China
[2] Minist Educ, Key Lab Comp Network & Informat Integrat, Nanjing 211189, Jiangsu, Peoples R China
[3] Beijing Inst Spacecraft Syst Engn, Beijing 100094, Peoples R China
Keywords
ERROR-DETECTION;
DOI
10.1155/2016/8213638
Chinese Library Classification (CLC) number
V [Aviation, Aerospace];
Discipline classification code
08 ; 0825 ;
Abstract
Soft errors caused by single event upsets are a severe challenge for aerospace-based computing. Silent data corruption (SDC) is one possible outcome of a soft error: the program produces erroneous output with no indication that anything went wrong. SDC is the most insidious outcome and is very difficult to detect. To address this problem, we design and implement an invariant-based system called Radish. Invariants describe certain properties of a program; for example, that the value of a variable equals a constant. Radish first extracts invariants at key program points and converts them into assertions. It then hardens the program by inserting the assertions into the source code. When a soft error violates an invariant, the corresponding assertion fails at run time and warns the user of the soft error. To increase SDC coverage, we further propose an extension of Radish, named Radish D, which applies a software-based instruction duplication mechanism to protect the code sections that the assertions do not cover. Experiments using architectural fault injection show that Radish achieves high SDC coverage with very low overhead. Furthermore, Radish D provides higher SDC coverage than either Radish or pure instruction duplication.
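The two mechanisms named in the abstract can be illustrated with a small sketch. This is not the Radish implementation: the function names, the checksum workload, the range invariant, and the `fault` parameter are all hypothetical, chosen only to show how an invariant turned into an assertion can catch some bit flips, and how duplicating a computation can catch a flip that stays inside the invariant's range.

```python
# Illustrative sketch only; every name here is hypothetical, not from Radish.

class SoftErrorDetected(Exception):
    """Raised when an inserted check fails at run time."""


def flip_bit(value: int, bit: int) -> int:
    """Simulate a single event upset by flipping one bit of an integer."""
    return value ^ (1 << bit)


def check(condition: bool, label: str) -> None:
    """Assertion helper: signal a detected soft error if the check fails."""
    if not condition:
        raise SoftErrorDetected(label)


def checksum(data, fault=None):
    """Sum of bytes modulo 256, hardened with an invariant assertion.

    fault: optional (index, bit) pair; flips `bit` of the accumulator
    right after the element at `index` has been added.
    """
    total = 0
    for i, byte in enumerate(data):
        total = (total + byte) % 256
        if fault is not None and fault[0] == i:
            total = flip_bit(total, fault[1])  # injected fault
        # Assumed invariant at this program point: 0 <= total <= 255.
        check(0 <= total <= 255, "range invariant on total")
    return total


def checksum_duplicated(data, fault=None):
    """Same computation, protected by duplicating the accumulator update."""
    total = 0
    shadow = 0
    for i, byte in enumerate(data):
        total = (total + byte) % 256
        shadow = (shadow + byte) % 256  # duplicated computation
        if fault is not None and fault[0] == i:
            total = flip_bit(total, fault[1])  # fault hits only one copy
        check(total == shadow, "duplicate mismatch")
    return total
```

In this sketch, a flip of a high bit pushes the accumulator outside the assumed range, so the assertion fires. A flip of a low bit leaves the value in range, so the assertion-only version returns a wrong result silently, while the duplicated version detects the mismatch.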
Pages: 10
Related papers
50 in total
  • [1] F_Radish: Enhancing Silent Data Corruption Detection for Aerospace-Based Computing
    Yang, Na
    Wang, Yun
    ELECTRONICS, 2021, 10 (01) : 1 - 20
  • [2] Silent Data Corruptions in Computing: Understand and Quantify
    Macieira, Thiago
    Gurumurthy, Sankar
    Gurumurthi, Sudhanva
    Haggag, Amr
    Papadimitriou, George
    Gizopoulos, Dimitris
    2024 IEEE 30TH INTERNATIONAL SYMPOSIUM ON ON-LINE TESTING AND ROBUST SYSTEM DESIGN, IOLTS 2024, 2024,
  • [3] Exploring the capabilities of support vector machines in detecting silent data corruptions
    Subasi, Omer
    Di, Sheng
    Bautista-Gomez, Leonardo
    Balaprakash, Prasanna
    Unsal, Osman
    Labarta, Jesus
    Cristal, Adrian
    Krishnamoorthy, Sriram
    Cappello, Franck
    SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS, 2018, 19 : 277 - 290
  • [4] Detecting Functional Dependence Program Invariants Based on Data Mining
    Liu Shukun
    Yang Xiaohua
    2009 ASIA-PACIFIC CONFERENCE ON INFORMATION PROCESSING (APCIP 2009), VOL 1, PROCEEDINGS, 2009, : 332 - +
  • [5] Mitigating Silent Data Corruptions in HPC Applications across Multiple Program Inputs
    Huang, Yafan
    Guo, Shengjian
    Di, Sheng
    Li, Guanpeng
    Cappello, Franck
    SC22: INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2022,
  • [6] Silent Data Corruptions in Computing Systems: Early Predictions and Large-Scale Measurements
    Gizopoulos, Dimitris
    Papadimitriou, George
    Chatzopoulos, Odysseas
    Karystinos, Nikos
    Dixit, Harish D.
    Sankar, Sriram
    IEEE EUROPEAN TEST SYMPOSIUM, ETS 2024, 2024,
  • [7] Low-cost Program-level Detectors for Reducing Silent Data Corruptions
    Hari, Siva Kumar Sastry
    Adve, Sarita V.
    Naeimi, Helia
    2012 42ND ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN), 2012,
  • [8] Mitigating Silent Data Corruptions in Integer Matrix Products: Toward Reliable Multimedia Computing on Unreliable Hardware
    Anarado, Ijeoma
    Anam, Mohammad Ashraful
    Verdicchio, Fabio
    Andreopoulos, Yiannis
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017, 27 (11) : 2476 - 2489
  • [9] Using instruction result locality and re-execution to mitigate silent data corruptions
    Tajary, Alireza
    Zarandi, Hamid R.
    MICROELECTRONICS RELIABILITY, 2016, 62 : 178 - 190
  • [10] On the Detection of Silent Data Corruptions in HPC Applications Using Redundant Multi-threading
    Perez, Diego
    Ropars, Thomas
    Meneses, Esteban
    EURO-PAR 2020: PARALLEL PROCESSING WORKSHOPS, 2021, 12480 : 290 - 302