A Communication Optimization Scheme for Basis Computation of Krylov Subspace Methods on Multi-GPUs

Authors
Chen, Langshi [1]
Petiton, Serge G. [1, 2]
Drummond, Leroy A. [3]
Hugues, Maxime [4]
Affiliations
[1] Digiteo Labs Bat 565 PC 190, Maison Simulat, USR3441, F-91191 Gif Sur Yvette, France
[2] Univ Sci & Technol Lille, Lab Informat Fondamentale Lille, F-59650 Villeneuve Dascq, France
[3] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
[4] INRIA Saclay, F-91120 Palaiseau, France
Keywords
Krylov subspace; Auto-tuning; Arnoldi orthogonalization;
DOI
10.1007/978-3-319-17353-5_1
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Krylov Subspace Methods (KSMs) are widely used for solving large-scale linear systems and eigenproblems. However, computing a Krylov subspace basis suffers from the overhead of the global reduction operations needed for the inner products in the orthogonalization steps. In this paper, a hypergraph-based communication optimization scheme is applied to the Arnoldi and incomplete Arnoldi methods for forming a Krylov subspace basis from a sparse matrix, and the features of these methods are compared analytically. Experiments on a CPU-GPU heterogeneous cluster show that the optimization improves the Arnoldi implementations for a generic matrix and yields up to a 10x speedup for some matrices with a special diagonal structure. The performance advantage also varies with subspace size and matrix format, which calls for further integration of an auto-tuning strategy.
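The reduction overhead mentioned in the abstract can be illustrated with a minimal single-process sketch. It does not reproduce the paper's hypergraph-based scheme or its GPU implementation; it only counts the inner products that would each become a global reduction on a distributed machine, for full Arnoldi and for an incomplete variant that orthogonalizes against the last q basis vectors only. The names arnoldi, matvec, dot, and the parameter q are assumptions made for this illustration, and the dense list-of-lists matrix stands in for a sparse format.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product; a stand-in for a sparse SpMV kernel."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y, counter):
    """Inner product. In a distributed run each call would be one global reduction."""
    counter["reductions"] += 1
    return sum(a * b for a, b in zip(x, y))

def arnoldi(A, v0, m, q=None):
    """Build up to m+1 Krylov basis vectors with modified Gram-Schmidt.

    q=None orthogonalizes against all previous vectors (full Arnoldi).
    An integer q orthogonalizes against the last q vectors only
    (incomplete Arnoldi, hypothetical parameter name).
    Returns (basis, counter), where counter["reductions"] is the dot count.
    """
    counter = {"reductions": 0}
    beta = math.sqrt(dot(v0, v0, counter))
    V = [[x / beta for x in v0]]
    for j in range(m):
        w = matvec(A, V[j])
        start = 0 if q is None else max(0, j - q + 1)
        for i in range(start, j + 1):
            h = dot(w, V[i], counter)              # one reduction per projection
            w = [wk - h * vk for wk, vk in zip(w, V[i])]
        norm = math.sqrt(dot(w, w, counter))       # one reduction for the norm
        if norm < 1e-12:                           # breakdown: subspace is invariant
            break
        V.append([wk / norm for wk in w])
    return V, counter
```

Without breakdown, the full variant performs 1 + m + m(m+1)/2 inner products, which grows quadratically in the subspace size m, while the incomplete variant is bounded by roughly (q+1) per step, at the cost of a basis that is only locally orthogonal.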
Pages: 3-16
Page count: 14
Related papers
  • [1] Alternatives for parallel Krylov subspace basis computation
    Sidje, RB
    NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 1997, 4 (04) : 305 - 331
  • [2] Krylov subspace methods for large multidimensional eigenvalue computation
    El Hachimi, Anas
    Jbilou, Khalide
    Ratnani, Ahmed
    APPLIED NUMERICAL MATHEMATICS, 2025, 208 : 205 - 221
  • [3] Using Quadruple Precision Arithmetic to Accelerate Krylov Subspace Methods on GPUs
    Mukunoki, Daichi
    Takahashi, Daisuke
    PARALLEL PROCESSING AND APPLIED MATHEMATICS (PPAM 2013), PT I, 2014, 8384 : 632 - 642
  • [4] Adaptive optimization modeling of preconditioned conjugate gradient on Multi-GPUs
    Gao J.
    Wang Y.
    Wang J.
    Liang R.
    ACM Transactions on Parallel Computing, 2016, 3 (03) : 1 - 33
  • [5] Krylov subspace methods for radial basis function interpolation
    Faul, AC
    Powell, MJD
    NUMERICAL ANALYSIS 1999, 2000, 420 : 115 - 141
  • [6] Speedup of Magnetic-Electric Matrices Assembly Computation by Means of a Multi-GPUs Environment
    Chiariello, A. G.
    Mastrostefano, S.
    Nicolazzo, M.
    Rubinacci, G.
    Tamburrino, A.
    Ventre, S.
    Villone, F.
    IEEE TRANSACTIONS ON MAGNETICS, 2016, 52 (03)
  • [7] Block Krylov subspace methods for the computation of structural response to turbulent wind
    Barbella, G.
    Perotti, F.
    Simoncini, V.
    COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2011, 200 (23-24) : 2067 - 2082
  • [8] ENLARGED KRYLOV SUBSPACE CONJUGATE GRADIENT METHODS FOR REDUCING COMMUNICATION
    Grigori, Laura
    Moufawad, Sophie
    Nataf, Frederic
    SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2016, 37 (02) : 744 - 773
  • [9] Efficient isogeometric topology optimization via multi-GPUs and CPUs heterogeneous architecture
    Han, Jinpeng
    Zhang, Haobo
    Gao, Baichuan
    Yu, Jingui
    Jin, Peng
    Yang, Jianzhong
    Xia, Zhaohui
    OPTIMIZATION AND ENGINEERING, 2024,
  • [10] AVOIDING COMMUNICATION IN NONSYMMETRIC LANCZOS-BASED KRYLOV SUBSPACE METHODS
    Carson, Erin
    Knight, Nicholas
    Demmel, James
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2013, 35 (05) : S42 - S61