PARALLEL SCALABILITY OF THREE-LEVEL FROSch PRECONDITIONERS TO 220000 CORES USING THE THETA SUPERCOMPUTER

Cited by: 2

Authors:

Heinlein, Alexander [1]

Rheinbach, Oliver [2, 3]

Roever, Friederike [2, 3]

Affiliations:

[1] Delft Univ Technol, Delft Inst Appl Math, Fac Elect Engn Math Comp Sci, Mekelweg 4, NL-2628 CD Delft, Netherlands

[2] Tech Univ Bergakad Freiberg, Fak Math & Informat, Zentrum effiziente Hochtemperatur Stoffwandlun Ze, D-09596 Freiberg, Germany

[3] Tech Univ Bergakad Freiberg, Fak Math & Informat, Univ rechenzentrum URZ, D-09596 Freiberg, Germany

Source:

SIAM Journal on Scientific Computing | 2023 / Vol. 45 / No. 3

Keywords:

domain decomposition; high performance computing; overlapping Schwarz; software; Trilinos; multilevel preconditioners; multilevel Schwarz

DOI:

10.1137/21M1431205

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics]

Discipline classification code:

070104

Abstract:

The parallel performance of the three-level fast and robust overlapping Schwarz (FROSch) preconditioners is investigated for linear elasticity. The FROSch framework is part of the Trilinos software library and contains a parallel implementation of different preconditioners with energy minimizing coarse spaces of generalized Dryja-Smith-Widlund type. The three-level extension is constructed by a recursive application of the FROSch preconditioner to the coarse problem. In this paper, the additional steps in the implementation in order to apply the FROSch preconditioner recursively are described in detail. Furthermore, it is shown that no explicit geometric information is needed in the recursive application of the preconditioner. In particular, the rigid body modes, including the rotations, can be interpolated on the coarse level without additional geometric information. Parallel results for a three-dimensional linear elasticity problem obtained on the Theta supercomputer (Argonne Leadership Computing Facility, Argonne, IL) using up to 220 000 cores are discussed and compared to results obtained on the SuperMUC-NG supercomputer (Leibniz Supercomputing Centre, Garching, Germany). Notably, it can be observed that a hierarchical communication operation in FROSch related to the coarse operator starts to dominate the computing time on Theta, which has a dragonfly interconnect, for 100 000 message passing interface (MPI) ranks or more. The same operation, however, scales well and stays within the order of a second in all experiments performed on SuperMUC-NG, which uses a fat tree network. Using hybrid MPI/OpenMP parallelization, the onset of the MPI communication problem on Theta can be delayed. Further analysis of the performance of FROSch on large supercomputers with dragonfly interconnects will be necessary.
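
The recursive construction described in the abstract can be illustrated with a small sketch. This is not the FROSch or Trilinos API, and it is not the authors' implementation: it assumes a 1D Laplacian instead of 3D linear elasticity, dense NumPy matrices, and a piecewise-constant coarse basis standing in for the energy-minimizing GDSW-type coarse space. All function names (`laplacian_1d`, `overlapping_blocks`, `coarse_basis`, `schwarz`, `preconditioned_condition`) are hypothetical. The only point it shows is that a three-level method arises when the coarse solve of a two-level additive Schwarz preconditioner is replaced by another application of the same preconditioner to the coarse matrix.

```python
import numpy as np

def laplacian_1d(n):
    """Dense 1D Laplacian (Dirichlet) as a stand-in model problem."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def overlapping_blocks(n, n_sub, overlap):
    """Index sets of n_sub contiguous subdomains, each extended by `overlap` nodes."""
    edges = np.linspace(0, n, n_sub + 1).astype(int)
    return [np.arange(max(edges[i] - overlap, 0), min(edges[i + 1] + overlap, n))
            for i in range(n_sub)]

def coarse_basis(n, n_sub):
    """Assumption: one piecewise-constant basis function per nonoverlapping subdomain."""
    edges = np.linspace(0, n, n_sub + 1).astype(int)
    phi = np.zeros((n, n_sub))
    for i in range(n_sub):
        phi[edges[i]:edges[i + 1], i] = 1.0
    return phi

def schwarz(K, n_subs, overlap=1):
    """Return r -> M^{-1} r for a multilevel additive Schwarz preconditioner.

    n_subs lists the number of subdomains per level. An empty list means an
    exact solve; [16] gives a two-level method; [16, 4] gives a three-level
    method in which the coarse problem is treated by recursion.
    """
    if not n_subs:
        K_inv = np.linalg.inv(K)
        return lambda r: K_inv @ r
    n = K.shape[0]
    blocks = overlapping_blocks(n, n_subs[0], overlap)
    local_inv = [np.linalg.inv(K[np.ix_(b, b)]) for b in blocks]
    phi = coarse_basis(n, n_subs[0])
    # Recursive step: the coarse matrix phi^T K phi is handled by the same routine.
    coarse_apply = schwarz(phi.T @ K @ phi, n_subs[1:], overlap)

    def apply(r):
        z = phi @ coarse_apply(phi.T @ r)      # coarse-level correction
        for b, Kb_inv in zip(blocks, local_inv):
            z[b] += Kb_inv @ r[b]              # overlapping local solves
        return z
    return apply

def preconditioned_condition(K, apply):
    """Condition number of M^{-1} K, assembled column by column."""
    MinvK = np.column_stack([apply(col) for col in K.T])
    eig = np.linalg.eigvals(MinvK).real
    return eig.max() / eig.min()

K = laplacian_1d(256)
cond_K = np.linalg.cond(K)
cond_two = preconditioned_condition(K, schwarz(K, [16]))
cond_three = preconditioned_condition(K, schwarz(K, [16, 4]))
print(cond_K, cond_two, cond_three)
```

In this toy setting the three-level variant never factorizes the full coarse matrix; it only solves the smaller second-level coarse problem exactly, which is the motivation for the recursion when the number of subdomains becomes very large.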

Pages: S173-S198

Page count: 26