Harnessing CUDA Dynamic Parallelism for the Solution of Sparse Linear Systems

Cited by: 1
Authors
Aliaga, Jose [1]
Davidovic, Davor [2]
Perez, Joaquin [1]
Quintana-Orti, Enrique S. [1]
Affiliations
[1] Univ Jaume I, Dept Ingn Ciencia Comp, Castellon de La Plana, Spain
[2] Inst Ruder Boskovic, Ctr Informat & Racunarstvo CIR, Zagreb, Croatia
Keywords
Graphics processing units (GPUs); CUDA dynamic parallelism; sparse linear systems; iterative solvers; high performance; energy efficiency
DOI
10.3233/978-1-61499-621-7-217
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We leverage CUDA dynamic parallelism to reduce the execution time and significantly lower the energy consumption of the Conjugate Gradient (CG) method for the iterative solution of sparse linear systems on graphics processing units (GPUs). Our new implementation of this solver is launched from the CPU as a single "parent" CUDA kernel, which in turn invokes other "child" CUDA kernels. The CPU can then either continue with other work while the solver runs asynchronously on the GPU, or block until the execution is completed. Our experiments on a server equipped with an Intel Core i7-3770K CPU and an NVIDIA "Kepler" K20c GPU illustrate the benefits of the new CG solver.
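For orientation, the CG iteration that such a solver carries out can be sketched on the CPU as follows. This is a plain Python reference, not the authors' GPU code: the function names (`csr_matvec`, `dot`, `conjugate_gradient`), the CSR storage layout, and the relative-residual stopping test are assumptions made for illustration. In a dynamic-parallelism design, each step in the loop body (sparse matrix-vector product, dot products, vector updates) would plausibly map to a child kernel launched from the parent kernel.

```python
import math


def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A x for a sparse matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def dot(u, v):
    """Inner product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite CSR matrix A.

    Returns the approximate solution and the number of iterations performed.
    The stopping test ||r|| <= tol * ||b|| is an illustrative choice.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual for the initial guess x = 0
    p = list(r)                      # first search direction
    rs_old = dot(r, r)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for it in range(max_iter):
        if math.sqrt(rs_old) <= tol * b_norm:
            return x, it
        q = csr_matvec(values, col_idx, row_ptr, p)        # q = A p
        alpha = rs_old / dot(p, q)                         # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # update solution
        r = [ri - alpha * qi for ri, qi in zip(r, q)]      # update residual
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]       # new direction
        rs_old = rs_new
    return x, max_iter


if __name__ == "__main__":
    # 4x4 tridiagonal matrix with 2 on the diagonal and -1 off the diagonal.
    values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
    col_idx = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
    row_ptr = [0, 2, 5, 8, 10]
    b = [0.0, 0.0, 0.0, 5.0]         # equals A applied to [1, 2, 3, 4]
    solution, iterations = conjugate_gradient(values, col_idx, row_ptr, b)
    print(solution, iterations)
```

Every pass through the loop needs the scalars `alpha` and `beta` before the next vector update can start. In a host-driven GPU implementation those dependencies typically force the CPU to stay involved in each iteration, which is the coupling that a single parent kernel is meant to avoid.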
Pages: 217-226
Number of pages: 10