Replicated process allocation for load distribution in fault-tolerant multicomputers

Cited by: 6
Authors
Kim, J [1]
Lee, H [1]
Lee, S [1]
Affiliations
[1] POHANG UNIV SCI & TECHNOL, DEPT ELECT ENGN, POHANG 790784, SOUTH KOREA
Keywords
backup process; checkpointing; fault-tolerant multicomputer; load balancing; process allocation;
DOI
10.1109/12.588067
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we consider a load-balancing process allocation method for fault-tolerant multicomputer systems that balances the load before as well as after faults start to degrade the performance of the system. In order to be able to tolerate a single fault, each process (primary process) is duplicated (i.e., has a backup process). The backup process executes on a different processor from the primary, checkpointing the primary process and recovering the process if the primary process fails. In this paper, we formalize the problem of load-balancing process allocation and propose a new process allocation method and analyze the performance of the proposed method. Simulations are used to compare the proposed method with a process allocation method that does not take into account the different load characteristics of the primary and backup processes. While both methods perform well before the occurrence of a fault, only the proposed method maintains a balanced load after the occurrence of such a fault.
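The load shift described in the abstract can be illustrated with a small sketch. It is not the paper's allocation method. It only computes per-processor load for a given placement, before and after one processor fails. The function name `processor_loads`, the parameter `backup_ratio` (the fraction of a primary's load that its backup costs while it is only checkpointing), and the example placement are all assumptions made for illustration.

```python
def processor_loads(primary_on, backup_on, load, backup_ratio, failed=None):
    """Return {processor: total load}, optionally after processor `failed` has failed.

    primary_on / backup_on map each process to the processor hosting its
    primary / backup copy. `backup_ratio` is a hypothetical parameter: the
    share of the primary's load that a backup costs before any fault.
    """
    processors = set(primary_on.values()) | set(backup_on.values())
    totals = {p: 0.0 for p in processors if p != failed}
    for proc, full in load.items():
        p, b = primary_on[proc], backup_on[proc]
        if p == b:
            raise ValueError("primary and backup must be on different processors")
        if p == failed:
            # The backup takes over and now carries the full primary load.
            totals[b] += full
        else:
            totals[p] += full
            if b != failed:
                # A surviving backup only pays the checkpointing share.
                totals[b] += backup_ratio * full
    return totals


# Hypothetical placement: three processes on three processors,
# each backup placed on the "next" processor in a ring.
primary_on = {"A": 0, "B": 1, "C": 2}
backup_on = {"A": 1, "B": 2, "C": 0}
load = {"A": 4.0, "B": 4.0, "C": 4.0}

before = processor_loads(primary_on, backup_on, load, backup_ratio=0.25)
after = processor_loads(primary_on, backup_on, load, backup_ratio=0.25, failed=0)
```

In this made-up placement every processor carries the same load before the fault, but once processor 0 fails, processor 1 takes over the full load of process A and ends up more heavily loaded than processor 2. This is the kind of post-fault imbalance the abstract says an allocation should avoid.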
Pages: 499-505
Page count: 7