A Hierarchical Jacobi Iteration for Structured Matrices on GPUs using Shared Memory

Cited by: 0

Authors:

Islam, Mohammad Shafaet ^{[1]}

Wang, Qiqi ^{[1]}

Affiliation:

[1] MIT, Dept Aeronaut & Astronaut, Cambridge, MA 02139 USA

Source:

2022 IEEE HIGH PERFORMANCE EXTREME COMPUTING VIRTUAL CONFERENCE (HPEC) | 2022

DOI:

10.1109/HPEC55821.2022.9926410

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812

Abstract:

This paper presents an algorithm to accelerate the Jacobi iteration for solving linear systems of equations arising from structured problems on graphics processing units (GPUs). Acceleration is achieved by utilization of on-chip GPU shared memory via a domain decomposition procedure. In particular, the problem domain is partitioned into subdomains whose data is copied to the shared memory of each GPU block. Jacobi iterations are performed internally within each block's shared memory while avoiding expensive global memory accesses every iteration, resulting in a hierarchical algorithm (which takes advantage of the GPU memory hierarchy). We investigate the algorithm performance on the linear systems arising from the discretization of Poisson's equation in 1D and 2D, and observe an 8x speedup in convergence in the 1D problem and a nearly 6x speedup in 2D compared to a conventional GPU implementation of Jacobi iteration which only relies on global memory.
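The idea in the abstract can be illustrated with a small CPU-side sketch in plain Python for the 1D Poisson problem -u'' = f with zero Dirichlet boundaries. This is not the authors' GPU implementation: the function names, the single-cell halo, the non-overlapping blocks, and the fixed number of inner sweeps are all assumptions made for illustration. A Python list stands in for global memory, and the per-block `local` buffer stands in for a GPU block's shared memory.

```python
def jacobi_sweep(u, f, h):
    """One conventional Jacobi sweep over the whole 1D grid (the 'global memory' baseline)."""
    n = len(u)
    new = u[:]
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0       # zero Dirichlet boundary
        right = u[i + 1] if i < n - 1 else 0.0  # zero Dirichlet boundary
        new[i] = 0.5 * (left + right + h * h * f[i])
    return new


def hierarchical_cycle(u, f, h, block_size, inner_iters):
    """One cycle of the sketched hierarchical scheme (hypothetical helper).

    Each subdomain is copied into a local buffer with one halo value per
    side, swept inner_iters times with the halos frozen, then written back.
    """
    n = len(u)
    out = u[:]
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        left_halo = u[start - 1] if start > 0 else 0.0
        right_halo = u[end] if end < n else 0.0
        # Stand-in for the copy from global memory to a block's shared memory.
        local = [left_halo] + u[start:end] + [right_halo]
        for _ in range(inner_iters):
            new = local[:]
            for j in range(1, len(local) - 1):
                i = start + j - 1  # global index of local point j
                left = local[j - 1]
                right = local[j + 1]
                new[j] = 0.5 * (left + right + h * h * f[i])
            local = new
        # Stand-in for the write-back from shared memory to global memory.
        out[start:end] = local[1:-1]
    return out


def residual_norm(u, f, h):
    """Euclidean norm of f - A u for the standard 3-point Laplacian."""
    n = len(u)
    total = 0.0
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r = f[i] - (2.0 * u[i] - left - right) / (h * h)
        total += r * r
    return total ** 0.5
```

With `inner_iters=1` a cycle reduces to an ordinary Jacobi sweep. Larger values trade fresher neighbor data for fewer round trips to the global array, which is the trade-off the abstract attributes to the GPU memory hierarchy. The sketch says nothing about the speedups reported in the paper.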

Pages: 7

50 records in total

[1] Implementing Asynchronous Jacobi Iteration on GPUs
Tsai, Yu-Hsiang Mike
Nayak, Pratik
Chow, Edmond
Anzt, Hartwig
[J]. Proceedings of ScalAH 2022: 13th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis, 2022 : 1 - 9
[2] The preconditioned inverse iteration for hierarchical matrices
Benner, Peter
Mach, Thomas
[J]. NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 2013, 20 (01) : 150 - 166
[3] ON RELAXATION OF JACOBI ITERATION FOR CONSISTENT AND GENERALIZED MASS MATRICES
WATHEN, AJ
[J]. COMMUNICATIONS IN APPLIED NUMERICAL METHODS, 1991, 7 (02) : 93 - 102
[4] USING SHARED MEMORY AS A CACHE IN CELLULAR AUTOMATA WATER FLOW SIMULATIONS ON GPUs
Topa, Pawel
Locek, Pawel M.
[J]. COMPUTER SCIENCE-AGH, 2013, 14 (03) : 385 - 401
[5] A Study of the Memory Wall within the Jacobi Iteration Method
Sun, Siqi
Wang, Shan
Shen, Wenfeng
Xu, Weimin
Zheng, Yanheng
[J]. 2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012 : 964 - 969
[6] Structured matrices and Newton's iteration: unified approach
Pan, VY
Rami, Y
Wang, XM
[J]. LINEAR ALGEBRA AND ITS APPLICATIONS, 2002, 343 : 233 - 265
[7] Efficient Batched Predecessor Search in Shared Memory on GPUs
Karsin, Ben
Casanova, Henri
Sitchinava, Nodari
[J]. 2015 IEEE 22ND INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2015 : 335 - 344
[8] OSM: Off-Chip Shared Memory for GPUs
Darabi, Sina
Yousefzadeh-Asl-Miandoab, Ehsan
Akbarzadeh, Negar
Falahati, Hajar
Lotfi-Kamran, Pejman
Sadrosadati, Mohammad
Sarbazi-Azad, Hamid
[J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3415 - 3429
[9] Pragma Directed Shared Memory Centric Optimizations on GPUs
Li, Jing
Liu, Lei
Wu, Yuan
Liu, Xiang-Hua
Gao, Yi
Feng, Xiao-Bing
Wu, Cheng-Yong
[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2016, 31 (02) : 235 - 252
[10] PARAMETRIZATION OF NEWTON ITERATION FOR COMPUTATIONS WITH STRUCTURED MATRICES AND APPLICATIONS
PAN, V
[J]. COMPUTERS & MATHEMATICS WITH APPLICATIONS, 1992, 24 (03) : 61 - 75