Scalable Direct-Iterative Hybrid Solver for Sparse Matrices on Multi-Core and Vector Architectures

Cited by: 3
Authors
Ono, Kenji [1]
Kato, Toshihiro [2]
Ohshima, Satoshi [3]
Nanri, Takeshi [1]
Affiliations
[1] Kyushu Univ, Fukuoka, Japan
[2] NEC Corp Ltd, Tokyo, Japan
[3] Nagoya Univ, Nagoya, Aichi, Japan
Keywords
parallel cyclic reduction; cache bandwidth; line successive over-relaxation
DOI
10.1145/3368474.3368484
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We propose an efficient direct-iterative hybrid solver for sparse matrices, the SLOR-PCR method, that exploits the scalability of the latest multi-core, many-core, and vector architectures, and we examine its execution performance. We also present an implementation of the PCR algorithm for SIMD and vector architectures that is structured so the compiler can readily generate optimized instructions. The hybrid method has high cache reusability, which suits modern architectures with a low byte-per-flop (B/F) ratio, because efficient use of the cache mitigates the memory bandwidth limitation. In measurements, the SLOR-PCR solver scaled well up to 352 cores in a cc-NUMA environment and outperformed the conventional Jacobi and Red-Black ordering methods by a factor of 3.6 to 8.3 on the SIMD architecture. The maximum speedup in computation time was a factor of 6.3 on the cc-NUMA architecture with 352 cores.
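For readers unfamiliar with parallel cyclic reduction (PCR), the following is a minimal sketch of the textbook algorithm for a single tridiagonal system. It is not the authors' SIMD or vector implementation; the function name `pcr_solve` and the array layout (`a` sub-diagonal, `b` diagonal, `c` super-diagonal, `d` right-hand side) are assumptions made for illustration. Every row in a reduction step is updated independently of the others, which is the property that makes PCR attractive for vectorization.

```python
def pcr_solve(a, b, c, d):
    """Solve a tridiagonal system with textbook parallel cyclic reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] and c[n-1] expected to be 0. Hypothetical helper for illustration.
    """
    n = len(d)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        # Each row i is independent within a step, so this loop is
        # the part that could be run in parallel or vectorized.
        for i in range(n):
            lo, hi = i - stride, i + stride
            nb[i] = b[i]
            nd[i] = d[i]
            if lo >= 0:
                # Eliminate the coupling to x[i - stride] using row lo.
                alpha = -a[i] / b[lo]
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                # Eliminate the coupling to x[i + stride] using row hi.
                gamma = -c[i] / b[hi]
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2
    # After about log2(n) steps every row is decoupled: b[i] * x[i] = d[i].
    return [d[i] / b[i] for i in range(n)]
```

The sketch assumes the pivots `b[i]` never become zero, which holds for diagonally dominant systems; it makes no attempt at the cache blocking or line relaxation that the abstract attributes to the SLOR-PCR solver.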
Pages: 11-21
Page count: 11