Efficient Parallel Multigrid Method on Intel Xeon Phi Clusters

Authors
Nakajima, Kengo [1]
Gerofi, Balazs [2]
Ishikawa, Yutaka [2]
Horikoshi, Masashi [3]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] RIKEN, R-CCS, Kobe, Hyogo, Japan
[3] Intel Corp, Tokyo, Japan
Source
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING IN ASIA-PACIFIC REGION WORKSHOPS (HPC ASIA 2021 WORKSHOPS) | 2020
Keywords
parallel iterative solvers; multigrid; SELL-C-sigma; light weight kernel
DOI
10.1145/3440722.3440882
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The parallel multigrid method is expected to play an important role in scientific computing on exascale supercomputer systems for solving large-scale linear equations with sparse matrices. Because solving sparse linear systems is a highly memory-bound process, an efficient storage format for coefficient matrices is crucial. In previous work, the authors applied the sliced ELL format to parallel conjugate gradient solvers with multigrid preconditioning (MGCG) for an application to 3D groundwater flow through heterogeneous porous media (pGW3D-FVM), and obtained excellent performance on large-scale multicore/manycore clusters. In the present work, the authors introduced SELL-C-sigma to the MGCG solver and evaluated its performance with various types of OpenMP/MPI hybrid parallel programming models on the Oakforest-PACS (OFP) system at JCAHPC, using up to 1,024 nodes of Intel Xeon Phi. Because SELL-C-sigma is well suited to wide-SIMD architectures such as Xeon Phi, the performance improvement over sliced ELL was more than 20%. This is one of the first examples of SELL-C-sigma applied to the forward/backward substitutions in the ILU-type smoother of a multigrid solver. Furthermore, the effects of IHK/McKernel were investigated, and it achieved an 11% improvement on 1,024 nodes.
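To make the storage format named in the abstract concrete, the following is a minimal sketch of a SELL-C-sigma layout in plain Python. It is not the authors' implementation. It assumes the usual description of the format: rows are sorted by descending length inside windows of `sigma` rows, grouped into chunks of `C` rows, padded to the longest row of each chunk, and stored column-major within the chunk. All function names, variable names, and the small test matrix are hypothetical, and the sketch shows only a sparse matrix-vector product, not the ILU-type forward/backward substitutions the paper targets.

```python
def build_sell_c_sigma(n_rows, row_ptr, col_idx, values, C=2, sigma=4):
    """Convert a CSR matrix to a SELL-C-sigma-style layout (illustrative only)."""
    lengths = [row_ptr[i + 1] - row_ptr[i] for i in range(n_rows)]

    # Sort rows by descending length, but only inside windows of sigma rows.
    perm = []
    for start in range(0, n_rows, sigma):
        window = list(range(start, min(start + sigma, n_rows)))
        window.sort(key=lambda r: -lengths[r])
        perm.extend(window)

    chunk_ptr, chunk_len = [0], []
    sell_val, sell_col = [], []
    for start in range(0, n_rows, C):
        rows = perm[start:start + C]
        width = max(lengths[r] for r in rows)
        chunk_len.append(width)
        # Column-major inside the chunk: the j-th entries of all C rows
        # are adjacent, which is what maps onto SIMD lanes.
        for j in range(width):
            for lane in range(C):
                if lane < len(rows) and j < lengths[rows[lane]]:
                    k = row_ptr[rows[lane]] + j
                    sell_val.append(values[k])
                    sell_col.append(col_idx[k])
                else:
                    # Padding entry: zero value, any valid column index.
                    sell_val.append(0.0)
                    sell_col.append(0)
        chunk_ptr.append(len(sell_val))
    return perm, chunk_ptr, chunk_len, sell_val, sell_col


def spmv_sell(n_rows, C, perm, chunk_ptr, chunk_len, sell_val, sell_col, x):
    """Compute y = A x from the layout produced by build_sell_c_sigma."""
    y = [0.0] * n_rows
    for c, width in enumerate(chunk_len):
        base = chunk_ptr[c]
        for lane in range(C):
            p = c * C + lane
            if p >= n_rows:
                continue  # lane belongs to padding rows in the last chunk
            acc = 0.0
            for j in range(width):
                k = base + j * C + lane
                acc += sell_val[k] * x[sell_col[k]]
            y[perm[p]] = acc  # undo the row permutation
    return y


def spmv_csr(n_rows, row_ptr, col_idx, values, x):
    """Reference CSR product used to check the SELL result."""
    return [sum(values[k] * x[col_idx[k]]
                for k in range(row_ptr[i], row_ptr[i + 1]))
            for i in range(n_rows)]


# Hypothetical 5x5 test matrix in CSR form, with rows of unequal length.
N = 5
ROW_PTR = [0, 2, 5, 6, 10, 12]
COL_IDX = [0, 1, 0, 1, 2, 2, 0, 1, 3, 4, 3, 4]
VALUES = [4.0, -1.0, -1.0, 4.0, -1.0, 4.0, 2.0, -1.0, 4.0, -1.0, -1.0, 4.0]
X = [1.0, 2.0, 3.0, 4.0, 5.0]

layout = build_sell_c_sigma(N, ROW_PTR, COL_IDX, VALUES, C=2, sigma=4)
y_sell = spmv_sell(N, 2, *layout, X)
y_csr = spmv_csr(N, ROW_PTR, COL_IDX, VALUES, X)
```

In this sketch, sorting inside each `sigma` window places rows of similar length in the same chunk, which reduces padding, while the column-major chunk layout lets `C` rows advance in lockstep. In practice `C` would typically be matched to the SIMD width of the target processor and `sigma` chosen as a multiple of `C`; the values used here are only for illustration.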
Pages: 46 - 49
Number of pages: 4
Related papers
50 in total
  • [1] Parallel Pairwise Correlation Computation On Intel Xeon Phi Clusters
    Liu, Yongchao
    Pan, Tony
    Aluru, Srinivas
    PROCEEDINGS OF 28TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING, (SBAC-PAD 2016), 2016, : 141 - 149
  • [2] Benchmarking Parallel Chess Search in Stockfish on Intel Xeon and Intel Xeon Phi Processors
    Czarnul, Pawel
    COMPUTATIONAL SCIENCE - ICCS 2018, PT III, 2018, 10862 : 457 - 464
  • [3] Efficient Array Slicing on the Intel Xeon Phi Coprocessor
    Bjornseth, Benjamin Andreassen
    Meyer, Jan Christian
    Natvig, Lasse
    ARRAY'17: PROCEEDINGS OF THE 4TH ACM SIGPLAN INTERNATIONAL WORKSHOP ON LIBRARIES, LANGUAGES, AND COMPILERS FOR ARRAY PROGRAMMING, 2017, : 40 - 47
  • [4] Performance Optimization of OpenFOAM* on Clusters of Intel® Xeon Phi™ Processors
    Ojha, Ravi
    Pawar, Prasad
    Gupta, Sonia
    Klemm, Michael
    Nambiar, Manoj
    2017 IEEE 24TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING WORKSHOPS (HIPCW), 2017, : 51 - 59
  • [5] An Efficient Optimization of Hll Method for the Second Generation of Intel Xeon Phi Processor
    Kulikov I.M.
    Chernykh I.G.
    Glinskiy B.M.
    Protasov V.A.
    Lobachevskii Journal of Mathematics, 2018, 39 (4) : 543 - 551
  • [6] Performance Evaluation of NAS Parallel Benchmarks on Intel® Xeon Phi™
    Ramachandran, Arunmoezhi
    Vienne, Jerome
    Van der Wijngaart, Rob
    Koesterke, Lars
    Sharapov, Ilya
    2013 42ND ANNUAL INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2013, : 736 - 743
  • [7] Tuning up TVD HOPMOC method on Intel MIC Xeon Phi Architectures with Intel Parallel Studio Tools
    Cabral, Frederico L.
    Osthoff, Carla
    Costa, Gabriel P.
    Brandao, Diego
    Kischinhevsky, Mauricio
    Gonzaga de Oliveira, Sanderson L.
    2017 INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING WORKSHOPS (SBAC-PADW), 2017, : 19 - 24
  • [8] A New Parallel Intel Xeon Phi Hydrodynamics Code for Massively Parallel Supercomputers
    Kulikov I.M.
    Chernykh I.G.
    Tutukov A.V.
    Lobachevskii Journal of Mathematics, 2018, 39 (9) : 1207 - 1216
  • [9] Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™
    Gomes, Jeremias M.
    Teodoro, George
    de Melo, Alba
    Kong, Jun
    Kurc, Tahsin
    Saltz, Joel H.
    2015 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 2015, : 25 - 32
  • [10] INTELLIGENT PARALLEL COMPUTER WITH INTEL XEON PHI PROCESSORS OF NEW GENERATION
    Khimich, O. M.
    Mova, V., I
    Nikolaichuk, O. O.
    Popov, O., V
    Chistjakova, T., V
    Tulchinsky, V. G.
    SCIENCE AND INNOVATION, 2018, 14 (06): : 61 - 72