GAMER-2: a GPU-accelerated adaptive mesh refinement code - accuracy, performance, and scalability

Cited by: 50
Authors
Schive, Hsi-Yu [1, 2]
ZuHone, John A. [3]
Goldbaum, Nathan J. [1]
Turk, Matthew J. [4, 5]
Gaspari, Massimo [6]
Cheng, Chin-Yu [7]
Affiliations
[1] Univ Illinois, Natl Ctr Supercomp Applicat, Urbana, IL 61820 USA
[2] Natl Taiwan Univ, Inst Astrophys, Taipei 10617, Taiwan
[3] Harvard Smithsonian Ctr Astrophys, 60 Garden St, Cambridge, MA 02138 USA
[4] Univ Illinois, Sch Informat Sci, Urbana, IL 61820 USA
[5] Univ Illinois, Dept Astron, Urbana, IL 61820 USA
[6] Princeton Univ, Dept Astrophys Sci, 4 Ivy Lane, Princeton, NJ 08544 USA
[7] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61820 USA
Funding
U.S. National Science Foundation
Keywords
methods: numerical; CONSTRAINED TRANSPORT; GALAXY CLUSTERS; SIMULATION; HYDRODYNAMICS; TURBULENCE; SOLVER; TREE; FLOW; SCHEMES; UPWIND
DOI
10.1093/mnras/sty2586
Chinese Library Classification
P1 [Astronomy]
Discipline classification code
0704
Abstract
We present GAMER-2, a GPU-accelerated adaptive mesh refinement (AMR) code for astrophysics. It provides a rich set of features, including adaptive time-stepping, several hydrodynamic schemes, magnetohydrodynamics, self-gravity, particles, star formation, chemistry and radiative processes with GRACKLE, data analysis with YT, and a memory pool for efficient object allocation. GAMER-2 is fully bitwise reproducible. For performance optimization, it adopts hybrid OpenMP/MPI/GPU parallelization and overlaps CPU computation, GPU computation, and CPU-GPU communication. Load balancing is achieved using a Hilbert space-filling curve on a level-by-level basis without the need to duplicate the entire AMR hierarchy on each MPI process. To provide convincing demonstrations of the accuracy and performance of GAMER-2, we directly compare with ENZO on isolated disc galaxy simulations and with FLASH on galaxy cluster merger simulations. We show that the physical results obtained by the different codes are in very good agreement, and that GAMER-2 outperforms ENZO and FLASH by nearly one and two orders of magnitude, respectively, on the Blue Waters supercomputer using 1-256 nodes. More importantly, GAMER-2 exhibits similar or even better parallel scalability compared to the other two codes. We also demonstrate good weak and strong scaling using up to 4096 GPUs and 65 536 CPU cores, and achieve a uniform resolution as high as 10 240^3 cells. Furthermore, GAMER-2 can be adopted as an AMR + GPUs framework and has been extensively used for wave dark matter simulations. GAMER-2 is open source (available at https://github.com/gamer-project/gamer) and new contributions are welcome.
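The load-balancing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not GAMER-2's implementation: it is a 2D toy (the code itself is 3D), and the names `hilbert_index` and `partition_by_curve`, the patch tuples, and the per-patch weights are all hypothetical. It orders the patches of a single refinement level along a Hilbert curve and cuts that ordering into contiguous chunks of roughly equal weight, one chunk per MPI rank. Applying it independently to each level would correspond to the level-by-level strategy.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) on an n-by-n grid (n a power of two) to its
    position along a 2D Hilbert curve (standard bit-twiddling form)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_curve(patches, n, n_ranks):
    """Assign patches (x, y, weight) of ONE level to ranks.

    Patches are sorted by Hilbert index, then the sorted list is cut
    into contiguous pieces of roughly equal total weight. Assumes the
    total weight is positive. Returns a dict {(x, y): rank}.
    """
    ordered = sorted(patches, key=lambda p: hilbert_index(n, p[0], p[1]))
    total = sum(w for _, _, w in ordered)
    assignment = {}
    running = 0.0
    for x, y, w in ordered:
        # Use the midpoint of this patch's weight interval to pick a rank.
        rank = min(n_ranks - 1, int((running + 0.5 * w) * n_ranks / total))
        assignment[(x, y)] = rank
        running += w
    return assignment


if __name__ == "__main__":
    n = 4
    patches = [(x, y, 1.0) for x in range(n) for y in range(n)]
    for (x, y), rank in sorted(partition_by_curve(patches, n, 4).items()):
        print((x, y), "-> rank", rank)
```

With uniform weights on a 4 x 4 grid and four ranks, each rank receives four patches that are contiguous along the curve, which tends to keep neighbouring patches on the same rank and so limits inter-rank communication.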
Pages: 4815-4840
Number of pages: 26
Related papers
26 in total
  • [1] A GPU-accelerated adaptive mesh refinement for immersed boundary methods
    Ji, Hua
    Lien, Fue-Sang
    Zhang, Fan
    COMPUTERS & FLUIDS, 2015, 118 : 131 - 147
  • [2] GAMER: A GRAPHIC PROCESSING UNIT ACCELERATED ADAPTIVE-MESH-REFINEMENT CODE FOR ASTROPHYSICS
    Schive, Hsi-Yu
    Tsai, Yu-Chih
    Chiueh, Tzihong
    ASTROPHYSICAL JOURNAL SUPPLEMENT SERIES, 2010, 186 (02) : 457 - 484
  • [3] An adaptive mesh, GPU-accelerated, and error minimized special relativistic hydrodynamics code
    Tseng, Po-Hsun
    Schive, Hsi-Yu
    Chiueh, Tzihong
    MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY, 2021, 504 (03) : 3298 - 3315
  • [4] H-AMR: A New GPU-accelerated GRMHD Code for Exascale Computing with 3D Adaptive Mesh Refinement and Local Adaptive Time Stepping
    Liska, M. T. P.
    Chatterjee, K.
    Issa, D.
    Yoon, D.
    Kaaz, N.
    Tchekhovskoy, A.
    van Eijnatten, D.
    Musoke, G.
    Hesp, C.
    Rohoza, V.
    Markoff, S.
    Ingram, A.
    van der Klis, M.
    ASTROPHYSICAL JOURNAL SUPPLEMENT SERIES, 2022, 263 (02)
  • [5] A GPU-Accelerated Fast Multipole Method for GROMACS: Performance and Accuracy
    Kohnke, Bartosz
    Kutzner, Carsten
    Grubmueller, Helmut
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2020, 16 (11) : 6938 - 6949
  • [6] A GPU-Accelerated Fast Multipole Method for Gromacs. Performance and Accuracy
    Kohnke, Bartosz
    Kutzner, Carsten
    Grubmuller, Helmut
    BIOPHYSICAL JOURNAL, 2021, 120 (03) : 177A - 177A
  • [7] Volume Visualization Using Adaptive Tetrahedral Mesh with GPU-Accelerated Fast Cell Search
    Kimura, Akinori
    Tanaka, Satoshi
    2015 IEEE NUCLEAR SCIENCE SYMPOSIUM AND MEDICAL IMAGING CONFERENCE (NSS/MIC), 2015
  • [8] Parallel GPU-accelerated adaptive mesh refinement on two-dimensional phase-field lattice Boltzmann simulation of dendrite growth
    Sakane, Shinji
    Aoki, Takayuki
    Takaki, Tomohiro
    COMPUTATIONAL MATERIALS SCIENCE, 2022, 211
  • [9] Evaluating Accuracy and Performance of GPU-Accelerated Random Walk Computation on Heterogeneous Networks
    Gong, Jiayu
    Cai, Lizhi
    Shen, Yuxin
    2016 17TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2016 : 541 - 545
  • [10] GPU accelerated cell-based adaptive mesh refinement on unstructured quadrilateral grid
    Luo, Xisheng
    Wang, Luying
    Ran, Wei
    Qin, Fenghua
    COMPUTER PHYSICS COMMUNICATIONS, 2016, 207 : 114 - 122