Fault-tolerance design of the IBM Enterprise System/9000 Type 9021 processors

Author: Chen, C.L.
Source: Year 1600 / Issue 36
Keywords
Diagnostic strategy - Fault isolation - IBM enterprise system
DOI
Not available
Abstract
The 9021-type processors offer the highest performance of the IBM Enterprise System/9000™ (ES/9000™) series. They also have the highest levels of concurrent error detection, fault isolation, recovery, and availability of any IBM general-purpose processor. High availability is achieved by minimizing component failure rates through improvements in the base technology, and through design techniques that permit detection of hard and soft failures, recovery and isolation, and component replacement concurrent with system operation. In this paper, we discuss fault-tolerant design techniques for array, logic, and storage subsystems. We also present the diagnostic strategy and the fault isolation and recovery techniques. New features such as the redundant power system and the Processor Availability Facility are described. The overall recovery design is described, as well as specific implementation schemes. The design process used to verify error detection, fault isolation, and recovery is also described.