Parallel ILU preconditioners in GPU computation

Authors
Yan Chen
Xuhong Tian
Hui Liu
Zhangxin Chen
Bo Yang
Wenyuan Liao
Peng Zhang
Ruijian He
Min Yang
Affiliations
[1] University of Calgary, Department of Chemical and Petroleum Engineering
[2] University of Calgary, Department of Mathematics and Statistics
[3] South China Agricultural University, College of Mathematics and Informatics
[4] Stony Brook University, Biomedical Engineering Department
Source
Soft Computing | 2018 / Vol. 22, Issue 24
Keywords
ILU; Block-wise matrix; Parallel computing; GPU; Preconditioner
DOI
Not available
Abstract
Accelerating large-scale linear solvers is crucial for scientific research and industrial applications, and preconditioners play a key role in improving the performance of iterative linear solvers. This paper summarizes and reviews our work on developing parallel ILU preconditioners on GPUs. The mechanisms of ILU(0), ILU(k), ILUT, enhanced ILUT, and block-wise ILU(k) are reviewed and analyzed to give clear guidance for the development of iterative linear solvers. ILU(0) is the most commonly used preconditioner, and the nonzero pattern of its matrix is exactly the same as that of the original matrix to be solved. ILU(k) uses k levels to control the pattern of its preconditioner matrix. ILUT selects entries for its preconditioner matrix by applying thresholds, without regard to the pattern of the original matrix. In addition to the point-wise ILU preconditioners, a block-wise ILU(k) preconditioner is carefully designed to support block-wise matrices. In the implementation, the RAS (Restricted Additive Schwarz) method is adopted to optimize the parallel structure of the preconditioner matrix. Coupled with the configuration parameters of the ILU preconditioners, this makes the parallel solution process complex, so decoupled algorithms are adopted. These algorithms are implemented and tested on NVIDIA GPUs. The experimental results show that a single-GPU implementation can speed up an ILU preconditioner by a factor of 10 compared with a traditional CPU implementation. The results also show that ILU(0) achieves better speedup than ILU(k) but converges more slowly. The level k of ILU(k) and the thresholds (p, t) of ILUT are effective parameters for controlling the balance between acceleration and convergence for ILU(k) and ILUT, respectively. All of these ILU preconditioners are characterized and compared in this work, giving practitioners a clear picture of, and numerical insight into, the ILU family.
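To make the ILU(0) description concrete, the following is a minimal sequential sketch in Python, not the paper's GPU implementation. It assumes a small dense list-of-lists matrix with nonzero pivots, and the function names `ilu0` and `ilu_solve` are hypothetical. It shows the defining property stated in the abstract: the factors keep exactly the nonzero pattern of the original matrix, so no fill-in is created.

```python
def ilu0(a):
    """ILU(0) of a square matrix given as a list of lists (illustrative only).

    Returns one matrix holding both factors: the strict lower triangle is L
    (unit diagonal implied) and the upper triangle, diagonal included, is U.
    Positions that are zero in `a` are never updated, so no fill-in appears.
    """
    n = len(a)
    lu = [row[:] for row in a]  # work on a copy
    pattern = [[a[i][j] != 0 for j in range(n)] for i in range(n)]
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue  # outside the pattern: skip this elimination step
            lu[i][k] /= lu[k][k]  # multiplier l_ik (assumes nonzero pivot)
            for j in range(k + 1, n):
                if pattern[i][j]:  # update only entries inside the pattern
                    lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def ilu_solve(lu, r):
    """Apply the preconditioner: solve (L U) z = r by two triangular sweeps."""
    n = len(lu)
    y = [0.0] * n
    for i in range(n):  # forward substitution with unit-diagonal L
        y[i] = r[i] - sum(lu[i][j] * y[j] for j in range(i))
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # backward substitution with U
        s = sum(lu[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (y[i] - s) / lu[i][i]
    return z


# Example: an "arrow" matrix whose exact LU would fill positions (1,2) and (2,1).
arrow = [[4.0, 1.0, 1.0],
         [1.0, 4.0, 0.0],
         [1.0, 0.0, 4.0]]
factors = ilu0(arrow)
z = ilu_solve(factors, [6.0, 5.0, 5.0])  # approximate solve, not exact
```

In an iterative solver, `ilu_solve` would be called once per iteration on the current residual. The two triangular sweeps are inherently sequential, which is the part that parallel schemes such as the RAS decomposition mentioned in the abstract aim to restructure.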
Pages: 8187 - 8205
Number of pages: 18