A parallel run-time iterative load balancing algorithm for solution-adaptive finite element meshes on hypercubes

Authors
Cheng, ML
Chung, YC
Affiliations
[1] FENG CHIA UNIV,DEPT INFORMAT ENGN,TAICHUNG 407,TAIWAN
[2] LING TUNG COLL,DEPT INFORMAT MANAGEMENT,TAICHUNG 408,TAIWAN
Keywords
hypercube; load balancing; DIME; mapping; solution-adaptive finite element meshes
DOI
10.1080/02533839.1996.9677797
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
To execute a finite element program efficiently on a hypercube, we need to map the nodes of the corresponding finite element graph to the processors of the hypercube such that each processor has approximately the same computational load and the communication among processors is minimized. If the number of nodes of a finite element graph does not increase during the execution of a program, the mapping needs to be performed only once. However, if a finite element graph is solution-adaptive, that is, the number of nodes increases in discrete steps as some finite elements are refined during execution, a run-time load balancing algorithm has to be performed many times in order to balance the computational load of the processors while keeping the communication cost as low as possible. In this paper, we propose a parallel iterative load balancing algorithm (ILB) to deal with the load imbalance problem of a solution-adaptive finite element program. The proposed algorithm has three properties. First, it is simple and easy to implement. Second, it executes quickly. Third, it guarantees that the computational load is balanced after the algorithm has run. We have implemented the proposed algorithm along with two parallel mapping algorithms, parallel orthogonal recursive bisection (ORB) [19] and parallel recursive mincut bipartitioning (MC) [8], on a 16-node NCUBE-2. Three criteria are used for performance evaluation: the execution time of the load balancing algorithms, the computation time of an application program under the different load balancing algorithms, and the total execution time of an application program over several refinement phases. Experimental results show that (1) the execution time of ILB is very short compared to those of MC and ORB; (2) the mappings produced by ILB are better than those of ORB and MC; and (3) the speedups produced by ILB are better than those of ORB and MC.
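The abstract does not spell out the steps of ILB, so the following is only an illustrative sketch of iterative load balancing on a hypercube, using the classic dimension-exchange scheme rather than the authors' algorithm. The function name `dimension_exchange` and the representation of load as one integer node count per processor are assumptions made for the example. The sketch ignores which mesh nodes migrate and therefore says nothing about communication cost.

```python
def dimension_exchange(loads):
    """Illustrative dimension-exchange balancing (NOT the paper's ILB).

    loads[p] is the number of mesh nodes held by processor p of a
    hypercube; len(loads) must be a power of two. In step k, each
    processor p pairs with its neighbor across dimension k (p XOR 2**k)
    and the pair splits its combined load as evenly as possible.
    """
    n = len(loads)
    dim = n.bit_length() - 1
    if n == 0 or (1 << dim) != n:
        raise ValueError("number of processors must be a power of two")
    loads = list(loads)  # work on a copy
    for k in range(dim):
        for p in range(n):
            q = p ^ (1 << k)  # neighbor of p across dimension k
            if p < q:  # handle each pair once
                total = loads[p] + loads[q]
                loads[p] = (total + 1) // 2  # lower-numbered side keeps any odd node
                loads[q] = total // 2
    return loads


if __name__ == "__main__":
    # 8 processors; a refinement phase has piled new nodes onto processor 0.
    print(dimension_exchange([40, 0, 0, 0, 8, 0, 0, 0]))
```

Each step touches only direct hypercube neighbors, which is why schemes of this kind are cheap to run after every refinement phase. The total node count is preserved, but with integer loads the result is not always perfectly even.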
Pages: 363-373
Number of pages: 11