A Scalable MPI_Comm_split Algorithm for Exascale Computing

被引:0
|
作者
Sack, Paul [1 ]
Gropp, William [1 ]
机构
[1] Univ Illinois, Urbana, IL 61801 USA
关键词
PERFORMANCE;
D O I
暂无
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Existing algorithms for creating communicators in MN programs will not scale well to future exascale supercomputers containing millions of cores. In this work, we present a novel communicator-creation algorithm that does scale well into millions of processes using three techniques: replacing the sorting at the end of MPI_Comm_split with merging as the color and key table is built, sorting the color and key table in parallel, and using a distributed table to store the output communicator data rather than a replicated table. This reduces the time cost of MPI_Comm_split in the worst case we consider from 22 seconds to 0.37 second. Existing algorithms build a table with as many entries as processes, using vast amounts of memory. Our algorithm uses a small, fixed amount of memory per communicator after MPI_Comm_split has finished and uses a fraction of the memory used by the conventional algorithm for temporary storage during the execution of MPI_Comm_split.
引用
收藏
页码:1 / 10
页数:10
相关论文
共 43 条