A Memory Efficient Algorithm for Adaptive Multidimensional Integration with Multiple GPUs

Authors
Arumugam, Kamesh [1, 2]
Godunov, Alexander [2, 3]
Ranjan, Desh [1, 2]
Terzic, Balsa [2, 4]
Zubair, Mohammad [1, 2]
Affiliations
[1] Old Dominion Univ, Dept Comp Sci, Norfolk, VA 23529 USA
[2] Old Dominion Univ, Ctr Accelerator Sci, Norfolk, VA 23529 USA
[3] Old Dominion Univ, Dept Phys, Norfolk, VA 23529 USA
[4] Ctr Adv Studies Accelerators, Jefferson Lab, Newport News, VA 23606 USA
Keywords
QUADRATURE;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We present a memory-efficient algorithm, and its implementation, for multidimensional numerical integration on a cluster of compute nodes with multiple GPU devices per node. Effective use of shared memory is important for GPU performance because the bandwidth of global memory is limited. CUHRE, the best known sequential algorithm for multidimensional numerical integration, uses a large dynamic heap data structure that is accessed frequently. Devising a GPU algorithm that caches part of this data structure in shared memory so as to minimize global memory access is a challenging task. The algorithm presented here addresses this problem. Furthermore, we propose a technique to scale this algorithm to multiple GPU devices. The algorithm was implemented on a cluster of Intel(R) Xeon(R) CPU X5650 compute nodes with 4 Tesla M2090 GPU devices per node. We observed a speedup of up to 240 on a single GPU device, compared with a speedup of 70 when the memory optimization was not used. On a cluster of 6 nodes (24 GPU devices) we obtained a speedup of up to 3250. All speedups are relative to the sequential implementation running on the compute node.
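To make the role of the heap concrete, the following is a minimal sequential sketch of heap-driven adaptive integration. It is not the authors' GPU algorithm and not CUHRE itself: as a simplifying assumption it uses a midpoint rule with a half-split error estimate in place of CUHRE's cubature rules, and the names `midpoint`, `split`, `evaluate`, and `integrate` are hypothetical. It only illustrates why every refinement step touches the heap (one pop, two pushes), which is the access pattern the abstract identifies as costly in GPU global memory.

```python
import heapq
import itertools


def midpoint(f, lo, hi):
    """Midpoint-rule estimate of the integral of f over the box [lo, hi]."""
    volume = 1.0
    for a, b in zip(lo, hi):
        volume *= (b - a)
    center = [(a + b) / 2.0 for a, b in zip(lo, hi)]
    return volume * f(center)


def split(lo, hi):
    """Halve the box along its longest axis and return the two children."""
    axis = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    mid = (lo[axis] + hi[axis]) / 2.0
    left_hi = list(hi)
    left_hi[axis] = mid
    right_lo = list(lo)
    right_lo[axis] = mid
    return (list(lo), left_hi), (right_lo, list(hi))


def evaluate(f, lo, hi):
    """Return (value, error) for one region.

    Assumption: the error is the gap between a one-box and a two-box
    midpoint estimate, a heuristic stand-in for a real cubature rule.
    """
    coarse = midpoint(f, lo, hi)
    fine = sum(midpoint(f, a, b) for a, b in split(lo, hi))
    return fine, abs(fine - coarse)


def integrate(f, lo, hi, tol=1e-3, max_regions=10000):
    """Refine the region with the largest error estimate until tol is met."""
    counter = itertools.count()  # tie-breaker so the heap never compares lists
    value, error = evaluate(f, lo, hi)
    # heapq is a min-heap, so store the negated error to pop the worst region.
    heap = [(-error, next(counter), list(lo), list(hi), value)]
    total_value, total_error = value, error
    while total_error > tol and len(heap) < max_regions:
        neg_err, _, r_lo, r_hi, r_val = heapq.heappop(heap)
        total_value -= r_val
        total_error += neg_err  # neg_err is the negated error of the popped region
        for c_lo, c_hi in split(r_lo, r_hi):
            c_val, c_err = evaluate(f, c_lo, c_hi)
            heapq.heappush(heap, (-c_err, next(counter), c_lo, c_hi, c_val))
            total_value += c_val
            total_error += c_err
    return total_value, total_error, len(heap)
```

For example, `integrate(lambda p: p[0]**2 + p[1]**2, [0.0, 0.0], [1.0, 1.0])` returns an estimate close to the exact value 2/3 together with the accumulated error estimate and the number of regions left in the heap. Because the error estimate is heuristic, the true error can exceed the reported one.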
Pages: 169-175
Page count: 7