Implementation of Parallel Sparse Cholesky Factorization on GPU

Authors
Zou, Dan [1]
Dou, Yong [1]
Affiliations
[1] National University of Defense Technology, National Laboratory for Parallel and Distributed Processing, Changsha, Hunan, People's Republic of China
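Keywords
sparse Cholesky factorization; GPU; performance; algorithms; solver
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract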
Direct methods for solving large sparse symmetric positive-definite linear systems of equations are popular because of their generality and robustness. Their main bottleneck is the sparse Cholesky factorization, which exhibits irregular memory access behavior and an unbalanced workload. In the past 10 years, many sparse Cholesky factorization algorithms have emerged that exploit new architectural features. However, the programming techniques currently employed on these platforms are not sufficient to implement sparse Cholesky factorization on many-core graphics processing units (GPUs), because irregular problem structures are mismatched with single-instruction multiple-thread GPU architectures. In this paper, we propose a task-based software approach to parallel sparse Cholesky factorization aimed at heterogeneous computing platforms with GPU accelerators. The CPU generates the tasks. An efficient task-scheduling mechanism guarantees that tasks execute in the correct order and ensures load-balanced execution on the GPU. The approach is compared with an existing solver on problems arising from a range of practical applications. The experimental results show that the proposed approach substantially improves the performance of sparse Cholesky factorization on the GPU, with a speedup of 2.7x to 4x.
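The abstract does not give the authors' task definitions or scheduler, so the following is only a minimal sketch of the general idea of dependency-driven task execution, under stated assumptions. It uses the textbook column-oriented formulation with two task types, `cdiv(k)` (finish column k) and `cmod(j, k)` (update column j with column k), stores the matrix densely for brevity, and runs the ready queue sequentially on the CPU. All function names (`symbolic_structure`, `count_dependencies`, `factorize`) are hypothetical and chosen for illustration.

```python
import math
from collections import deque


def symbolic_structure(A):
    """For each column k, return the sorted rows j > k where L[j][k] is
    structurally nonzero (original entries plus fill-in)."""
    n = len(A)
    struct = [{i for i in range(k + 1, n) if A[i][k] != 0.0} for k in range(n)]
    for k in range(n):
        if struct[k]:
            parent = min(struct[k])  # parent of k in the elimination tree
            struct[parent] |= struct[k] - {parent}
    return [sorted(s) for s in struct]


def count_dependencies(struct):
    """cdiv(j) must wait for one cmod(j, k) per nonzero L[j][k] with k < j."""
    pending = {("cdiv", j): 0 for j in range(len(struct))}
    for rows in struct:
        for j in rows:
            pending[("cdiv", j)] += 1
    return pending


def factorize(A):
    """Return (L, order): the Cholesky factor and the task execution order."""
    n = len(A)
    L = [[A[i][j] if i >= j else 0.0 for j in range(n)] for i in range(n)]
    struct = symbolic_structure(A)
    pending = count_dependencies(struct)
    ready = deque(t for t, count in pending.items() if count == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        if task[0] == "cdiv":
            k = task[1]
            L[k][k] = math.sqrt(L[k][k])
            for i in struct[k]:
                L[i][k] /= L[k][k]
            # Column k is now final, so the updates that read it may run.
            for j in struct[k]:
                ready.append(("cmod", j, k))
        else:
            _, j, k = task
            ljk = L[j][k]
            for i in struct[k]:
                if i >= j:
                    L[i][j] -= L[i][k] * ljk
            # One fewer outstanding update for column j.
            pending[("cdiv", j)] -= 1
            if pending[("cdiv", j)] == 0:
                ready.append(("cdiv", j))
    return L, order


# Small symmetric positive-definite example; entry (3, 1) is zero in A
# but becomes nonzero in L (fill-in).
A = [
    [4.0, 2.0, 0.0, 2.0],
    [2.0, 5.0, 2.0, 0.0],
    [0.0, 2.0, 5.0, 1.5],
    [2.0, 0.0, 1.5, 6.25],
]
L, order = factorize(A)
```

In this sketch a task enters the ready queue only after every task it depends on has finished, which is one simple way to guarantee a correct execution order. A real GPU implementation would also have to serialize or combine concurrent `cmod` updates that target the same column; that concern does not arise here because the queue is drained by a single thread.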
Pages: 2228-2232
Page count: 5