Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster

Cited by: 18
Authors
Yamazaki, Ichitaro [1]
Rajamanickam, Sivasankaran [2]
Boman, Erik G. [2]
Hoemmen, Mark [2]
Heroux, Michael A. [2]
Tomov, Stanimire [1]
Affiliations
[1] Univ Tennessee, Knoxville, TN 37996 USA
[2] Sandia Natl Labs, POB 5800, Albuquerque, NM 87185 USA
Keywords
LINEAR-SYSTEMS; IMPLEMENTATION; GMRES
DOI
10.1109/SC.2014.81
Chinese Library Classification number
TP301 [Theory and methods]
Subject classification code
081202
Abstract
Krylov subspace projection methods are widely used iterative methods for solving large-scale linear systems of equations. Researchers have demonstrated that communication-avoiding (CA) techniques can improve the performance of Krylov methods on modern computers, where communication is becoming increasingly expensive compared to arithmetic operations. In this paper, we extend these studies with two major contributions. First, we present our implementation of a CA variant of the Generalized Minimum Residual (GMRES) method, called CA-GMRES, for solving nonsymmetric linear systems of equations on a hybrid CPU/GPU cluster. Our performance results on up to 120 GPUs show that CA-GMRES gives a speedup of up to 2.5x in total solution time over standard GMRES on a hybrid cluster with twelve Intel Xeon CPUs and three Nvidia Fermi GPUs on each node. Second, we outline a domain decomposition framework to introduce a family of preconditioners that are suitable for CA Krylov methods. Our preconditioners do not incur any additional communication and allow the easy reuse of existing algorithms and software for the subdomain solves. Experimental results on the hybrid CPU/GPU cluster demonstrate that CA-GMRES with preconditioning achieves a speedup of up to 7.4x over CA-GMRES without preconditioning, and a speedup of up to 1.7x over GMRES with preconditioning, in total solution time. These results confirm the potential of our framework to develop a practical and effective preconditioned CA Krylov method.
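As a rough illustration of the two ideas the abstract combines, the following single-process NumPy sketch builds s+1 Krylov basis vectors in one pass and orthogonalizes them with a single QR factorization, using a block-diagonal (block Jacobi) preconditioner whose solves stay inside each subdomain. This is not the authors' implementation or their preconditioner: block Jacobi is a simple stand-in, the monomial basis is an assumption, and the names block_jacobi_apply and s_step_basis, the test matrix, and all parameter values are hypothetical.

```python
import numpy as np


def block_jacobi_apply(A, r, block_size):
    """Return z = M^{-1} r, where M is the block diagonal of A.

    Each block solve reads only its own rows of r, so in a distributed
    setting no data would cross subdomain boundaries.
    """
    n = A.shape[0]
    z = np.empty_like(r)
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        z[start:stop] = np.linalg.solve(A[start:stop, start:stop], r[start:stop])
    return z


def s_step_basis(A, v, s, block_size):
    """Build s+1 monomial basis vectors of K(M^{-1} A, v), then orthogonalize once.

    Standard GMRES orthogonalizes after every matrix-vector product; here all
    s products are done first and one QR factorization follows. The monomial
    basis can become ill-conditioned as s grows, so s is kept small.
    """
    n = A.shape[0]
    V = np.empty((n, s + 1))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(s):
        V[:, j + 1] = block_jacobi_apply(A, A @ V[:, j], block_size)
    Q, R = np.linalg.qr(V)  # single orthogonalization of the whole block
    return Q, R


# Hypothetical small nonsymmetric, diagonally dominant test matrix.
n, s, block_size = 12, 3, 4
A = 4.0 * np.eye(n) - np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
v = np.ones(n)
Q, R = s_step_basis(A, v, s, block_size)
print("basis is orthonormal:", np.allclose(Q.T @ Q, np.eye(s + 1)))
```

In a distributed run, the product A @ V[:, j] would still need neighbor data; a matrix powers kernel would fetch that data once for all s steps, which this dense single-process sketch does not model.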
Pages: 933-944
Number of pages: 12