A Case for Soft Error Detection and Correction in Computational Chemistry

Cited by: 7
Authors
van Dam, Hubertus J. J. [1]
Vishnu, Abhinav [1]
de Jong, Wibe A. [1]
Affiliations
[1] Pacific NW Natl Lab, Richland, WA 99354 USA
DOI
10.1021/ct400489c
Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Subject classification codes
070304; 081704
Abstract
High-performance computing platforms are expected to deliver 10^18 floating-point operations per second by the year 2022 through the deployment of millions of cores. Even if every core is highly reliable, the sheer number of them means that the mean time between failures will become so short that most application runs will suffer at least one fault. In particular, soft errors caused by intermittent incorrect behavior of the hardware are a concern, as they lead to silent data corruption. In this paper, we investigate the impact of soft errors on optimization algorithms, using Hartree-Fock as a particular example. Optimization algorithms iteratively reduce the error in the initial guess to reach the intended solution. Therefore, they may intuitively appear to be resilient to soft errors. Our results show that this is true for soft errors of small magnitudes but not for large errors. We suggest error detection and correction mechanisms for different classes of data structures. The results obtained with these mechanisms indicate that we can correct more than 95% of the soft errors at moderate increases in the computational cost.
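The distinction the abstract draws between soft errors of small and large magnitude can be illustrated with a single bit flip in a floating-point value. The sketch below is not taken from the paper: `flip_bit` is a hypothetical helper, and it assumes values are stored as IEEE 754 binary64 doubles. It shows that flipping a low mantissa bit perturbs a value only slightly, whereas flipping a high exponent bit changes it by many orders of magnitude.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its binary64 encoding inverted.

    Hypothetical helper for illustration only. Bit 0 is the least
    significant mantissa bit, bits 52-62 hold the exponent, bit 63 the sign.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range 0..63")
    # Reinterpret the double as a 64-bit unsigned integer.
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    # Invert the chosen bit and reinterpret the result as a double.
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


x = 1.0
small = flip_bit(x, 0)   # low mantissa bit: 1.0 becomes 1.0 + 2**-52
large = flip_bit(x, 61)  # high exponent bit: 1.0 becomes 2**-512

print(f"original:            {x!r}")
print(f"mantissa bit flip:   {small!r} (relative change {abs(small - x) / x:.1e})")
print(f"exponent bit flip:   {large!r} (relative change {abs(large - x) / x:.1e})")
```

An iterative optimizer could plausibly absorb the first kind of perturbation in later iterations, while the second kind replaces the value outright, which is consistent with the abstract's claim that only small errors are tolerated.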
Pages: 3995-4005
Number of pages: 11