TASK ALLOCATION AND REALLOCATION FOR FAULT-TOLERANCE IN MULTICOMPUTER SYSTEMS

Cited by: 6

Authors:

CHEN, CIH [1]

CHERKASSKY, V [1]

Affiliation:

[1] UNIV MINNESOTA,DEPT ELECT ENGN,MINNEAPOLIS,MN 55455

Source:

IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS | 1994 / Vol. 30 / No. 04

DOI:

10.1109/7.328753

Chinese Library Classification:

V [Aeronautics, Astronautics];

Subject classification code:

08 ; 0825 ;

Abstract:

The goal of task allocation in a set of interconnected processors (computers) is to maximize the efficient use of resources and thus reduce the job turnaround time. We propose a simple yet effective method for allocating tasks in multicomputer systems that minimizes the interprocessor communication cost subject to resource limitations defined by the system and the designer. These limitations can be viewed as resulting from load balancing, since the execution time of each task, the number of available processors, the processor speed, and the memory capacity are known to the system or the designer. As the number of processors increases, the probability that a failure exists somewhere in the system at any given time also increases. Very few established task allocation models have considered reliability. In multicomputer systems, we define system reliability as the probability that the system can run the tasks successfully. After the (nonredundant) task scheduling strategy is defined, tasks are reallocated to processors statically and redundantly. This is a form of time redundancy: if some processors fail during execution, all tasks can still be completed on the remaining processors (though in a longer time). Because tasks are preallocated statically, this method is simpler and thus more practical than the well-known dynamic reconfiguration and rollback recovery techniques in multicomputer systems. We demonstrate the effectiveness of task allocation and reallocation for hardware fault tolerance by applying the methods to several examples and to practical communication-network multiprocessor systems.
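
The two-stage idea in the abstract (allocate for minimal communication cost under a resource cap, then add static redundant copies) can be illustrated with a small sketch. This is not the authors' algorithm: the exhaustive search, the ring-style backup rule, the single load cap, the assumption of independent processor failures, and all names (allocate, add_backups, survives, reliability) are hypothetical choices made only for illustration.

```python
from itertools import product


def comm_cost(assign, traffic):
    """Sum the traffic between task pairs placed on different processors."""
    return sum(c for (a, b), c in traffic.items() if assign[a] != assign[b])


def allocate(exec_time, traffic, n_procs, max_load):
    """Exhaustive search (assumed, tiny inputs only) for the placement with
    minimal communication cost whose per-processor load stays within max_load."""
    tasks = sorted(exec_time)
    best, best_cost = None, float("inf")
    for placement in product(range(n_procs), repeat=len(tasks)):
        assign = dict(zip(tasks, placement))
        load = [0.0] * n_procs
        for task, proc in assign.items():
            load[proc] += exec_time[task]
        if max(load) > max_load:
            continue  # violates the resource limitation
        cost = comm_cost(assign, traffic)
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost


def add_backups(assign, n_procs):
    """Static redundancy (assumed rule): back each task up on the next processor."""
    return {task: (proc + 1) % n_procs for task, proc in assign.items()}


def survives(assign, backup, failed):
    """True if every task has a primary or backup copy on a working processor."""
    return all(assign[t] not in failed or backup[t] not in failed for t in assign)


def reliability(assign, backup, n_procs, p_up):
    """Probability that all tasks can still run, assuming each processor is
    independently up with probability p_up."""
    total = 0.0
    for state in product([True, False], repeat=n_procs):
        failed = {i for i, up in enumerate(state) if not up}
        if survives(assign, backup, failed):
            prob = 1.0
            for up in state:
                prob *= p_up if up else 1.0 - p_up
            total += prob
    return total


# Hypothetical example data: four equal tasks on two processors.
exec_time = {"A": 2, "B": 2, "C": 2, "D": 2}
traffic = {("A", "B"): 5, ("C", "D"): 4, ("B", "C"): 1}
primary, cost = allocate(exec_time, traffic, n_procs=2, max_load=4)
backup = add_backups(primary, n_procs=2)
print(primary, cost, reliability(primary, backup, 2, p_up=0.9))
```

In this toy case the heavily communicating pairs end up on the same processor, and the backup copies let the system tolerate the loss of either processor, at the price of the surviving processor running every task.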

Pages: 1094-1104

Number of pages: 11
