Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster

Cited by: 18
Authors
Yamazaki, Ichitaro [1]
Rajamanickam, Sivasankaran [2]
Boman, Erik G. [2]
Hoemmen, Mark [2]
Heroux, Michael A. [2]
Tomov, Stanimire [1]
Affiliations
[1] Univ Tennessee, Knoxville, TN 37996 USA
[2] Sandia Natl Labs, POB 5800, Albuquerque, NM 87185 USA
Keywords
LINEAR-SYSTEMS; IMPLEMENTATION; GMRES;
DOI
10.1109/SC.2014.81
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Krylov subspace projection methods are widely used iterative methods for solving large-scale linear systems of equations. Researchers have demonstrated that communication-avoiding (CA) techniques can improve the performance of Krylov methods on modern computers, where communication is becoming increasingly expensive compared to arithmetic operations. In this paper, we extend these studies with two major contributions. First, we present our implementation of a CA variant of the Generalized Minimum Residual (GMRES) method, called CA-GMRES, for solving nonsymmetric linear systems of equations on a hybrid CPU/GPU cluster. Our performance results on up to 120 GPUs show that CA-GMRES gives a speedup of up to 2.5x in total solution time over standard GMRES on a hybrid cluster with twelve Intel Xeon CPUs and three Nvidia Fermi GPUs on each node. Second, we outline a domain decomposition framework that introduces a family of preconditioners suitable for CA Krylov methods. Our preconditioners do not incur any additional communication and allow the easy reuse of existing algorithms and software for the subdomain solves. Experimental results on the hybrid CPU/GPU cluster demonstrate that CA-GMRES with preconditioning achieves a speedup of up to 7.4x over CA-GMRES without preconditioning, and a speedup of up to 1.7x over GMRES with preconditioning, in total solution time. These results confirm the potential of our framework for developing a practical and effective preconditioned CA Krylov method.
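To make the idea of a preconditioner whose subdomain solves need no extra communication concrete, the following minimal Python sketch applies a block Jacobi preconditioner. Block Jacobi is used here only as a generic stand-in for a domain decomposition preconditioner; it is not the framework described in the paper. The function names (`solve_dense`, `block_jacobi_apply`) and the representation of subdomains as index ranges are assumptions made for illustration.

```python
# Illustrative sketch only: block Jacobi as a generic, communication-free
# domain decomposition preconditioner. This is an assumption-based example,
# not the preconditioner family proposed in the paper.

def solve_dense(a, b):
    """Solve a small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def block_jacobi_apply(a, r, subdomains):
    """Return z = M^{-1} r, where M keeps only the diagonal blocks of a.

    Each (start, stop) range in `subdomains` is solved on its own, so a
    process that owns one subdomain needs no data from its neighbours.
    """
    z = [0.0] * len(r)
    for start, stop in subdomains:
        local_a = [row[start:stop] for row in a[start:stop]]
        z[start:stop] = solve_dense(local_a, r[start:stop])
    return z


if __name__ == "__main__":
    # 1D Laplacian-like tridiagonal matrix split into two subdomains.
    a = [[4, -1, 0, 0],
         [-1, 4, -1, 0],
         [0, -1, 4, -1],
         [0, 0, -1, 4]]
    print(block_jacobi_apply(a, [3, 3, 3, 3], [(0, 2), (2, 4)]))
```

Because each local solve is an ordinary small linear solve, any existing solver could in principle replace `solve_dense`, which mirrors the abstract's point about reusing existing software for the subdomain solves.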
Pages: 933-944
Number of pages: 12