A User-level Library for Fault Tolerance on Shared Memory Multicore Systems

Cited by: 0
Authors
Mushtaq, Hamid [1]
Al-Ars, Zaid [1]
Bertels, Koen [1]
Affiliations
[1] Delft Univ Technol, Comp Engn Lab, Delft, Netherlands
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809
Abstract
The ever-decreasing transistor size has made it possible to integrate multiple cores on a single die. On the downside, this has introduced reliability concerns, as smaller transistors are more prone to both transient and permanent faults. However, the abundant extra processing resources of a multicore system can be exploited to provide fault tolerance through redundant execution. We have designed a library for multicore processing that can make a multithreaded user-level application fault tolerant with simple modifications to the code. It uses the spare cores in the system to perform redundant execution for error detection, and it also allows recovery through checkpoint/rollback. Our library is portable, since it does not depend on any special hardware. Furthermore, the overhead our library adds to the original application (up to 46% for 4 threads) is lower than that of other existing approaches, such as Respec.
Pages: 266-269
Page count: 4
Related papers
50 in total
  • [41] DetLock: Portable and Efficient Deterministic Execution for Shared Memory Multicore Systems
    Mushtaq, Hamid
    Al-Ars, Zaid
    Bertels, Koen
    [J]. 2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC), 2012, : 721 - 730
  • [42] Maintaining Scalability of Test Generation Using Multicore Shared Memory Systems
    Hadjitheophanous, Stavros
    Neophytou, Stelios N.
    Michael, Maria K.
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (02) : 553 - 564
  • [43] IMPULP: A Hardware Approach for In-Process Memory Protection via User-Level Partitioning
    Yang-Yang Zhao
    Ming-Yu Chen
    Yu-Hang Liu
    Zong-Hao Yang
    Xiao-Jing Zhu
    Zong-Hui Hong
    Yun-Ge Guo
    [J]. Journal of Computer Science and Technology, 2020, 35 : 418 - 432
  • [44] Research and Implementation of User-Level Load Forecasting Based on Load Model Library with Electric Data
    Qiao Junfeng
    Pan Sen
    Yang Pei
    [J]. 2019 4TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2019), 2019, : 799 - 802
  • [45] uMMAP-IO: User-level Memory-mapped I/O for HPC
    Rivas-Gomez, Sergio
    Fanfarillo, Alessandro
    Valat, Sebastien
    Laferriere, Christophe
    Couvee, Philippe
    Narasimhamurthy, Sai
    Markidis, Stefano
    [J]. 2019 IEEE 26TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS (HIPC), 2019, : 363 - 372
  • [46] FPGA Implementation of a Configurable Cache/Scratchpad Memory with Virtualized User-Level RDMA Capability
    Kalokerinos, George
    Papaefstathiou, Vassilis
    Nikiforos, George
    Kavadias, Stamatis
    Katevenis, Manolis
    Pnevmatikatos, Dionisios
    Yang, Xiaojun
    [J]. 2009 INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING AND SIMULATION, PROCEEDINGS, 2009, : 149 - 156
  • [47] IMPULP: A Hardware Approach for In-Process Memory Protection via User-Level Partitioning
    Zhao, Yang-Yang
    Chen, Ming-Yu
    Liu, Yu-Hang
    Yang, Zong-Hao
    Zhu, Xiao-Jing
    Hong, Zong-Hui
    Guo, Yun-Ge
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2020, 35 (02) : 418 - 432
  • [48] Parallel Performance Problems on Shared-Memory Multicore Systems: Taxonomy and Observation
    Atachiants, Roman
    Doherty, Gavin
    Gregg, David
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2016, 42 (08) : 764 - 785
  • [49] Fast and Accurate Statistical Simulation of Shared-Memory Applications on Multicore Systems
    Jiang, Fan
    Maeda, Rafael K. V.
    Feng, Jun
    Chen, Shixi
    Chen, Lin
    Li, Xiao
    Xu, Jiang
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (10) : 2455 - 2469
  • [50] FlashLite: A User-Level Library to Enhance Durability of SSD for P2P File Sharing
    Kim, Hyojun
    Ramachandran, Umakishore
    [J]. 2009 29TH IEEE INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS, 2009, : 534 - 541