Availability in parallel systems: Automatic process restart

Cited by: 3
Authors
Bowen, NS [1]
Antognini, J [1]
Regan, RD [1]
Matsakis, NC [1]
Affiliation
[1] IBM Corp., S/390 Division, Poughkeepsie, NY 12601
Keywords
Availability - Computer architecture - Computer operating systems - Computer system recovery - Data processing - Information retrieval systems
DOI
10.1147/sj.362.0284
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Parallel and clustered architectures are increasingly used as a foundation for high-capacity servers. At the same time, availability expectations are rising rapidly, since the effects of downtime become more apparent and carry higher economic consequences for larger systems. The use of parallel structures generally implies more hardware and software components. More and larger components increase the chance that an individual component will fail, and such a failure can hurt the overall availability of the system. This paper discusses the use of "restart techniques" as an important strategy for providing increased availability in a parallel structure. The paper covers a set of functions that have been developed for the S/390® Parallel Sysplex™.
Pages: 284-300
Page count: 17