A Scalable MPI_Comm_split Algorithm for Exascale Computing
Cited by: 0
Authors: Sack, Paul [1]; Gropp, William [1]
Affiliation: [1] Univ Illinois, Urbana, IL 61801 USA
Keywords: PERFORMANCE
DOI: not available
Chinese Library Classification: TP3 [Computing technology, computer technology]
Discipline classification code: 0812
Abstract:
Existing algorithms for creating communicators in MPI programs will not scale well to future exascale supercomputers containing millions of cores. In this work, we present a novel communicator-creation algorithm that does scale well into millions of processes using three techniques: replacing the sorting at the end of MPI_Comm_split with merging as the color and key table is built, sorting the color and key table in parallel, and using a distributed table to store the output communicator data rather than a replicated table. This reduces the time cost of MPI_Comm_split in the worst case we consider from 22 seconds to 0.37 seconds. Existing algorithms build a table with as many entries as processes, using vast amounts of memory. Our algorithm uses a small, fixed amount of memory per communicator after MPI_Comm_split has finished, and during the execution of MPI_Comm_split it uses a fraction of the temporary storage required by the conventional algorithm.
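The first technique in the abstract, merging the color and key table as it is built instead of sorting it at the end, can be illustrated with a small serial sketch. This is not the authors' implementation: it simulates a tree-structured gather in a single Python process, and the helper names (`merge_tables`, `build_table_by_merging`, `new_ranks`) as well as the `(color, key, old_rank)` tuple layout are assumptions made for illustration. The only MPI fact relied on is that MPI_Comm_split orders processes within a color by key, with ties broken by rank in the old communicator.

```python
from heapq import merge


def merge_tables(left, right):
    """Merge two tables that are each already sorted by (color, key, old_rank)."""
    return list(merge(left, right))


def build_table_by_merging(entries):
    """Simulate a tree-structured gather (hypothetical helper).

    Every process starts with a one-entry table, which is trivially sorted.
    In each round, neighbouring tables are merged pairwise, so the table
    stays sorted as it grows and no final sort is needed.
    """
    tables = [[entry] for entry in entries]
    while len(tables) > 1:
        next_round = []
        for i in range(0, len(tables), 2):
            if i + 1 < len(tables):
                next_round.append(merge_tables(tables[i], tables[i + 1]))
            else:
                # Odd table out: carried unchanged into the next round.
                next_round.append(tables[i])
        tables = next_round
    return tables[0] if tables else []


def new_ranks(table):
    """Map each old rank to (color, rank within the new communicator)."""
    next_rank = {}
    result = {}
    for color, key, old_rank in table:
        rank_in_color = next_rank.get(color, 0)
        result[old_rank] = (color, rank_in_color)
        next_rank[color] = rank_in_color + 1
    return result


# Six processes split into two colors, with keys that reverse the old order.
entries = [(rank % 2, -rank, rank) for rank in range(6)]
table = build_table_by_merging(entries)
ranks = new_ranks(table)
```

Because every intermediate table is already sorted, each merge is linear in the combined table size. The sketch does not cover the other two techniques (parallel sorting and the distributed output table), which need real inter-process communication.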
Pages: 1 / 10
Page count: 10