Scalable Direct-Iterative Hybrid Solver for Sparse Matrices on Multi-Core and Vector Architectures

Cited by: 3

Authors:

Ono, Kenji [1]

Kato, Toshihiro [2]

Ohshima, Satoshi [3]

Nanri, Takeshi [1]

Affiliations:

[1] Kyushu Univ, Fukuoka, Japan

[2] NEC Corp Ltd, Tokyo, Japan

[3] Nagoya Univ, Nagoya, Aichi, Japan

Source:

PROCEEDINGS OF INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING IN ASIA-PACIFIC REGION (HPC ASIA 2020) | 2020

Keywords:

parallel cyclic reduction; cache bandwidth; line successive over-relaxation

DOI:

10.1145/3368474.3368484

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In this paper, we propose an efficient direct-iterative hybrid solver for sparse matrices that exploits the scalability of the latest multi-core, many-core, and vector architectures, and we examine the execution performance of the proposed SLOR-PCR method. We also present an efficient implementation of the PCR algorithm for SIMD and vector architectures, structured so that the compiler can readily generate optimized instructions. The proposed hybrid method has high cache reusability, which suits modern low byte-per-flop (B/F) architectures because efficient use of the cache can mitigate the memory bandwidth limitation. Measurements showed that the SLOR-PCR solver scaled well up to 352 cores in a cc-NUMA environment and outperformed the conventional Jacobi and Red-Black ordering methods by a factor of 3.6 to 8.3 on the SIMD architecture. In addition, the maximum speedup in computation time was a factor of 6.3 on the cc-NUMA architecture with 352 cores.
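The abstract names the PCR (parallel cyclic reduction) algorithm without describing its steps. As a rough illustration only, the sketch below shows the textbook form of PCR for a single tridiagonal system; it is not the authors' SIMD/vector implementation, and the function name `pcr_solve` and the coefficient layout (`a` sub-diagonal, `b` diagonal, `c` super-diagonal, `d` right-hand side) are assumptions made for this example. It also assumes a diagonally dominant matrix so that no pivot becomes zero.

```python
def pcr_solve(a, b, c, d):
    """Solve a tridiagonal system with textbook parallel cyclic reduction.

    Row i reads: a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].
    Hypothetical helper for illustration; assumes diagonal dominance.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    a[0] = 0.0       # no neighbour to the left of the first row
    c[n - 1] = 0.0   # no neighbour to the right of the last row

    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        # Every row is updated independently of the others in this sweep,
        # which is what makes the loop amenable to SIMD/vector execution.
        for i in range(n):
            lo, hi = i - stride, i + stride
            nb[i], nd[i] = b[i], d[i]
            if lo >= 0:
                alpha = -a[i] / b[lo]      # eliminates x[lo] from row i
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                gamma = -c[i] / b[hi]      # eliminates x[hi] from row i
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2                        # coupling distance doubles

    # After about log2(n) sweeps every row is decoupled: b[i]*x[i] = d[i].
    return [d[i] / b[i] for i in range(n)]


# Usage: a 5x5 system whose exact solution is x = [1, 2, 3, 4, 5].
x = pcr_solve([0, 1, 1, 1, 1], [4, 4, 4, 4, 4], [1, 1, 1, 1, 0],
              [6, 12, 18, 24, 24])
```

Each sweep touches all rows with the same arithmetic and no loop-carried dependence, which is the property that a vectorizing compiler can exploit; a production version would typically use contiguous arrays rather than Python lists.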

Pages: 11-21

Page count: 11


