A Hierarchical Jacobi Iteration for Structured Matrices on GPUs using Shared Memory

Cited by: 0
Authors
Islam, Mohammad Shafaet [1]
Wang, Qiqi [1]
Affiliations
[1] MIT, Dept Aeronaut & Astronaut, Cambridge, MA 02139 USA
DOI
10.1109/HPEC55821.2022.9926410
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
This paper presents an algorithm that accelerates the Jacobi iteration for solving linear systems of equations arising from structured problems on graphics processing units (GPUs). The acceleration comes from using on-chip GPU shared memory through a domain decomposition procedure: the problem domain is partitioned into subdomains, and each subdomain's data is copied to the shared memory of a GPU block. Jacobi iterations are then performed within each block's shared memory, avoiding expensive global memory accesses at every iteration. The result is a hierarchical algorithm that takes advantage of the GPU memory hierarchy. We investigate the algorithm's performance on the linear systems arising from the discretization of Poisson's equation in 1D and 2D, and observe an 8x speedup in convergence for the 1D problem and a nearly 6x speedup in 2D, compared to a conventional GPU implementation of Jacobi iteration that relies only on global memory.
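The idea described in the abstract can be illustrated with a minimal CPU sketch in Python for the 1D Poisson problem. This is not the authors' GPU implementation: a plain local list stands in for a block's shared memory, and the function names, the `sub_size` and `inner_iters` parameters, and the choice of non-overlapping subdomains with one frozen halo point on each side are all assumptions made for illustration.

```python
def jacobi_sweep(u, f, h):
    """One Jacobi sweep for -u'' = f on a uniform 1D grid.

    u[0] and u[-1] are treated as fixed boundary (or halo) values.
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return new


def hierarchical_cycle(u, f, h, sub_size, inner_iters):
    """One outer cycle of the sketched hierarchical scheme (assumed layout).

    Each subdomain copies its points plus one halo point per side into a
    local buffer (a stand-in for GPU shared memory), runs `inner_iters`
    Jacobi sweeps on that buffer with the halo values frozen, and then
    writes its interior points back to the global array.
    """
    n = len(u)
    result = list(u)
    for start in range(1, n - 1, sub_size):
        stop = min(start + sub_size, n - 1)
        local_u = u[start - 1:stop + 1]   # subdomain plus halos
        local_f = f[start - 1:stop + 1]
        for _ in range(inner_iters):
            local_u = jacobi_sweep(local_u, local_f, h)
        result[start:stop] = local_u[1:-1]  # write back interior only
    return result


def solve_poisson_1d(n, sub_size, inner_iters, cycles):
    """Solve -u'' = 2 on [0, 1] with u(0) = u(1) = 0 (exact: u = x(1 - x))."""
    h = 1.0 / (n - 1)
    u = [0.0] * n
    f = [2.0] * n
    for _ in range(cycles):
        u = hierarchical_cycle(u, f, h, sub_size, inner_iters)
    return u


if __name__ == "__main__":
    n = 17
    u = solve_poisson_1d(n, sub_size=4, inner_iters=4, cycles=200)
    h = 1.0 / (n - 1)
    err = max(abs(u[i] - (i * h) * (1 - i * h)) for i in range(n))
    print("max error after 200 cycles:", err)
```

With `inner_iters=1` the sketch reduces to an ordinary Jacobi sweep over the whole grid. Larger values trade fewer global read/write rounds for more local work, which is the trade-off the abstract attributes to shared memory; the sketch says nothing about actual GPU timings.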
Pages: 7
Related papers
50 in total
  • [1] Implementing Asynchronous Jacobi Iteration on GPUs
    Tsai, Yu-Hsiang Mike
    Nayak, Pratik
    Chow, Edmond
    Anzt, Hartwig
    [J]. Proceedings of ScalAH 2022: 13th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis, 2022 : 1 - 9
  • [2] The preconditioned inverse iteration for hierarchical matrices
    Benner, Peter
    Mach, Thomas
    [J]. NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 2013, 20 (01) : 150 - 166
  • [3] ON RELAXATION OF JACOBI ITERATION FOR CONSISTENT AND GENERALIZED MASS MATRICES
    WATHEN, AJ
    [J]. COMMUNICATIONS IN APPLIED NUMERICAL METHODS, 1991, 7 (02) : 93 - 102
  • [4] USING SHARED MEMORY AS A CACHE IN CELLULAR AUTOMATA WATER FLOW SIMULATIONS ON GPUs
    Topa, Pawel
    Locek, Pawel M.
    [J]. COMPUTER SCIENCE-AGH, 2013, 14 (03) : 385 - 401
  • [5] A Study of the Memory Wall within the Jacobi Iteration Method
    Sun, Siqi
    Wang, Shan
    Shen, Wenfeng
    Xu, Weimin
    Zheng, Yanheng
    [J]. 2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012 : 964 - 969
  • [6] Structured matrices and Newton's iteration: unified approach
    Pan, VY
    Rami, Y
    Wang, XM
    [J]. LINEAR ALGEBRA AND ITS APPLICATIONS, 2002, 343 : 233 - 265
  • [7] Efficient Batched Predecessor Search in Shared Memory on GPUs
    Karsin, Ben
    Casanova, Henri
    Sitchinava, Nodari
    [J]. 2015 IEEE 22ND INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2015 : 335 - 344
  • [8] OSM: Off-Chip Shared Memory for GPUs
    Darabi, Sina
    Yousefzadeh-Asl-Miandoab, Ehsan
    Akbarzadeh, Negar
    Falahati, Hajar
    Lotfi-Kamran, Pejman
    Sadrosadati, Mohammad
    Sarbazi-Azad, Hamid
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3415 - 3429
  • [9] Pragma Directed Shared Memory Centric Optimizations on GPUs
    Li, Jing
    Liu, Lei
    Wu, Yuan
    Liu, Xiang-Hua
    Gao, Yi
    Feng, Xiao-Bing
    Wu, Cheng-Yong
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2016, 31 (02) : 235 - 252
  • [10] PARAMETRIZATION OF NEWTON ITERATION FOR COMPUTATIONS WITH STRUCTURED MATRICES AND APPLICATIONS
    PAN, V
    [J]. COMPUTERS & MATHEMATICS WITH APPLICATIONS, 1992, 24 (03) : 61 - 75