A Memory Efficient Algorithm for Adaptive Multidimensional Integration with Multiple GPUs

Cited by: 0

Authors:

Arumugam, Kamesh [1,2]

Godunov, Alexander [2,3]

Ranjan, Desh [1,2]

Terzic, Balsa [2,4]

Zubair, Mohammad [1,2]

Affiliations:

[1] Old Dominion Univ, Dept Comp Sci, Norfolk, VA 23529 USA

[2] Old Dominion Univ, Ctr Accelerator Sci, Norfolk, VA 23529 USA

[3] Old Dominion Univ, Dept Phys, Norfolk, VA 23529 USA

[4] Ctr Adv Studies Accelerators, Jefferson Lab, Newport News, VA 23606 USA

Source:

2013 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC) | 2013

Keywords:

QUADRATURE;

DOI:

Not available

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

We present a memory-efficient algorithm and its implementation for multidimensional numerical integration on a cluster of compute nodes with multiple GPU devices per node. Effective use of shared memory is important for GPU performance because the bandwidth of global memory is limited. The best-known sequential algorithm for multidimensional numerical integration, CUHRE, uses a large dynamic heap data structure that is accessed frequently. Devising a GPU algorithm that caches part of this data structure in shared memory so as to minimize global memory access is a challenging task. The algorithm presented here addresses this problem. Furthermore, we propose a technique to scale this algorithm to multiple GPU devices. The algorithm was implemented on a cluster of Intel(R) Xeon(R) CPU X5650 compute nodes with 4 Tesla M2090 GPU devices per node. We observed a speedup of up to 240 on a single GPU device, compared with a speedup of 70 when memory optimization was not used. On a cluster of 6 nodes (24 GPU devices), we obtained a speedup of up to 3250. All speedups are relative to the sequential implementation running on the compute node.
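The abstract points to a large, frequently accessed heap as the memory bottleneck of sequential adaptive integration. As a rough illustration of where that heap comes from, the following is a minimal Python sketch of a globally adaptive integrator that keeps every subregion in a max-heap keyed by its error estimate. It is not CUHRE and not the GPU algorithm of the paper: the midpoint rule, the coarse-versus-fine error heuristic, and the names `adaptive_integrate` and `_estimate` are all assumptions made for illustration only.

```python
import heapq
from itertools import count

def _estimate(f, a, b):
    """Return (value, error, split_axis) for the box with corners a and b.

    Assumption: a midpoint rule stands in for CUHRE's cubature rules.
    """
    widths = [hi - lo for lo, hi in zip(a, b)]
    volume = 1.0
    for w in widths:
        volume *= w
    center = [(lo + hi) / 2 for lo, hi in zip(a, b)]
    coarse = volume * f(center)
    # Split along the widest axis (first one on ties).
    axis = max(range(len(a)), key=lambda i: widths[i])
    # Refined estimate: midpoint rule on the two halves along that axis.
    left, right = list(center), list(center)
    left[axis] -= widths[axis] / 4
    right[axis] += widths[axis] / 4
    fine = (volume / 2) * (f(left) + f(right))
    # Heuristic error: difference between coarse and refined estimates.
    return fine, abs(fine - coarse), axis

def adaptive_integrate(f, a, b, tol=1e-6, max_regions=100_000):
    """Hypothetical helper: integrate f over the box [a, b] adaptively."""
    tie = count()  # tie-breaker so heap entries never compare beyond it
    value, error, axis = _estimate(f, a, b)
    # heapq is a min-heap, so store the negated error to pop the worst region.
    heap = [(-error, next(tie), tuple(a), tuple(b), value, axis)]
    total_value, total_error = value, error
    while total_error > tol and len(heap) < max_regions:
        neg_err, _, lo, hi, val, ax = heapq.heappop(heap)
        total_value -= val
        total_error += neg_err  # neg_err is -error of the popped region
        mid = (lo[ax] + hi[ax]) / 2
        left_hi = hi[:ax] + (mid,) + hi[ax + 1:]
        right_lo = lo[:ax] + (mid,) + lo[ax + 1:]
        for sub_lo, sub_hi in ((lo, left_hi), (right_lo, hi)):
            v, e, s = _estimate(f, sub_lo, sub_hi)
            heapq.heappush(heap, (-e, next(tie), sub_lo, sub_hi, v, s))
            total_value += v
            total_error += e
    return total_value, total_error, len(heap)
```

Every iteration pops the region with the largest error and pushes two children, so the heap grows by one region per step and is touched on every step. In this sketch that is the access pattern a GPU version would want to serve from fast on-chip memory rather than global memory.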


Pages: 169-175

Page count: 7

