Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster

Cited by: 18
Authors
Yamazaki, Ichitaro [1]
Rajamanickam, Sivasankaran [2]
Boman, Erik G. [2]
Hoemmen, Mark [2]
Heroux, Michael A. [2]
Tomov, Stanimire [1]
Affiliations
[1] Univ Tennessee, Knoxville, TN 37996 USA
[2] Sandia Natl Labs, POB 5800, Albuquerque, NM 87185 USA
Keywords
LINEAR-SYSTEMS; IMPLEMENTATION; GMRES;
DOI
10.1109/SC.2014.81
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Krylov subspace projection methods are widely used iterative methods for solving large-scale linear systems of equations. Researchers have demonstrated that communication-avoiding (CA) techniques can improve Krylov methods' performance on modern computers, where communication is becoming increasingly expensive compared to arithmetic operations. In this paper, we extend these studies with two major contributions. First, we present our implementation of a CA variant of the Generalized Minimum Residual (GMRES) method, called CA-GMRES, for solving nonsymmetric linear systems of equations on a hybrid CPU/GPU cluster. Our performance results on up to 120 GPUs show that CA-GMRES gives a speedup of up to 2.5x in total solution time over standard GMRES on a hybrid cluster with twelve Intel Xeon CPUs and three Nvidia Fermi GPUs on each node. Second, we outline a domain decomposition framework to introduce a family of preconditioners that are suitable for CA Krylov methods. Our preconditioners do not incur any additional communication and allow the easy reuse of existing algorithms and software for the subdomain solves. Experimental results on the hybrid CPU/GPU cluster demonstrate that CA-GMRES with preconditioning achieves a speedup of up to 7.4x over CA-GMRES without preconditioning, and a speedup of up to 1.7x over GMRES with preconditioning, in total solution time. These results confirm the potential of our framework to develop a practical and effective preconditioned CA Krylov method.
Pages: 933 - 944
Number of pages: 12
Related papers
50 in total
  • [41] DYNAMIC AUTOTUNING OF ADAPTIVE FAST MULTIPOLE METHODS ON HYBRID MULTICORE CPU AND GPU SYSTEMS
    Holm, Marcus
    Engblom, Stefan
    Goude, Anders
    Holmgren, Sverker
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2014, 36 (04) : C376 - C399
  • [42] Hybrid CPU/GPU Integral Engine for Strong-Scaling Ab Initio Methods
    Kussmann, Joerg
    Ochsenfeld, Christian
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2017, 13 (07) : 3153 - 3159
  • [43] Local preconditioners for two-level non-overlapping domain decomposition methods
    Carvalho, LM
    Giraud, L
    Meurant, G
    NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 2001, 8 (04) : 207 - 227
  • [44] Accelerating frequency-domain simulations using small shared-memory CPU/GPU cluster
    Topa, Tomasz
    Noga, Artur
    Karwowski, Andrzej
    2016 21ST INTERNATIONAL CONFERENCE ON MICROWAVE, RADAR AND WIRELESS COMMUNICATIONS (MIKON), 2016,
  • [45] Learning Driven Parallelization for Large-Scale Video Workload in Hybrid CPU-GPU Cluster
    Zhang, Haitao
    Tang, Bingchang
    Geng, Xin
    Ma, Huadong
    PROCEEDINGS OF THE 47TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2018,
  • [46] Comparison of Two-Level Preconditioners Derived from Deflation, Domain Decomposition and Multigrid Methods
    Tang, J. M.
    Nabben, R.
    Vuik, C.
    Erlangga, Y. A.
    JOURNAL OF SCIENTIFIC COMPUTING, 2009, 39 (03) : 340 - 370
  • [47] Comparison of Two-Level Preconditioners Derived from Deflation, Domain Decomposition and Multigrid Methods
    J. M. Tang
    R. Nabben
    C. Vuik
    Y. A. Erlangga
    Journal of Scientific Computing, 2009, 39 : 340 - 370
  • [48] Hybrid CPU–GPU implementation of the transformed spatial domain channel estimation algorithm for mmWave MIMO systems
    Diego Lloria
    Pablo M. Aviles
    Jose A. Belloch
    Sandra Roger
    Carmen Botella-Mascarell
    Almudena Lindoso
    The Journal of Supercomputing, 2023, 79 : 9371 - 9382
  • [49] An acceleration technique for 2D MOC based on Krylov subspace and domain decomposition methods
    Zhang, Hongbo
    Wu, Hongchun
    Cao, Liangzhi
    ANNALS OF NUCLEAR ENERGY, 2011, 38 (12) : 2742 - 2751
  • [50] An acceleration technique for 2D MOC based on Krylov subspace and domain decomposition methods
    School of Nuclear Science and Technology, Xi'An Jiaotong University, Xi'an Shaanxi 710049, China
    Ann Nucl Energy, 1600, 12 (2742-2751):