Performance evaluation of the sparse matrix-vector multiplication on modern architectures

Cited by: 58
Authors
Goumas, Georgios [1]
Kourtis, Kornilios [1]
Anastopoulos, Nikos [1]
Karakasis, Vasileios [1]
Koziris, Nectarios [1]
Affiliation
[1] Natl Tech Univ Athens, Comp Syst Lab, Sch Elect & Comp Engn, Zografos 15780, Greece
Source
JOURNAL OF SUPERCOMPUTING | 2009 / Vol. 50 / No. 01
Keywords
Sparse matrix-vector multiplication; Multicore architectures; Scientific applications; Performance evaluation;
DOI
10.1007/s11227-008-0251-8
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we revisit the performance issues of the widely used sparse matrix-vector multiplication (SpMxV) kernel on modern microarchitectures. Previous scientific work reports a number of different factors that may significantly reduce performance. However, the interaction of these factors with the underlying architectural characteristics is not clearly understood, a fact that may lead to misguided, and thus unsuccessful attempts for optimization. In order to gain an insight into the details of SpMxV performance, we conduct a suite of experiments on a rich set of matrices for three different commodity hardware platforms. In addition, we investigate the parallel version of the kernel and report on the corresponding performance results and their relation to each architecture's specific multithreaded configuration. Based on our experiments, we extract useful conclusions that can serve as guidelines for the optimization process of both single and multithreaded versions of the kernel.
Pages: 36 - 77
Page count: 42
Related papers
50 in total
  • [21] Node aware sparse matrix-vector multiplication
    Bienz, Amanda
    Gropp, William D.
    Olson, Luke N.
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2019, 130 : 166 - 178
  • [22] STRUCTURED SPARSE MATRIX-VECTOR MULTIPLICATION ON A MASPAR
    DEHN, T
    EIERMANN, M
    GIEBERMANN, K
    SPERLING, V
    ZEITSCHRIFT FUR ANGEWANDTE MATHEMATIK UND MECHANIK, 1994, 74 (06): T534 - T538
  • [23] Sparse matrix-vector multiplication - Final solution?
    Simecek, Ivan
    Tvrdik, Pavel
    PARALLEL PROCESSING AND APPLIED MATHEMATICS, 2008, 4967 : 156 - 165
  • [24] Optimization of Block Sparse Matrix-Vector Multiplication on Shared-Memory Parallel Architectures
    Eberhardt, Ryan
    Hoemmen, Mark
    2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016: 663 - 672
  • [25] A hybrid format for better performance of sparse matrix-vector multiplication on a GPU
    Guo, Dahai
    Gropp, William
    Olson, Luke N.
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2016, 30 (01): 103 - 120
  • [26] Scale-Free Sparse Matrix-Vector Multiplication on Many-Core Architectures
    Liang, Yun
    Tang, Wai Teng
    Zhao, Ruizhe
    Lu, Mian
    Huynh Phung Huynh
    Goh, Rick Siow Mong
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2017, 36 (12) : 2106 - 2119
  • [27] Adaptive Optimization of Sparse Matrix-Vector Multiplication on Emerging Many-Core Architectures
    Chen, Shizhao
    Fang, Jianbin
    Chen, Donglin
    Xu, Chuanfu
    Wang, Zheng
    IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018: 649 - 658
  • [28] Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors
    Zhang, Kai
    Chen, Shuming
    Wang, Yaohua
    Wan, Jianghua
    IEICE ELECTRONICS EXPRESS, 2013, 10 (09)
  • [29] Evaluating the Performance Impact of Communication Imbalance in Sparse Matrix-Vector Multiplication
    Utrera, Gladys
    Gil, Marisa
    Martorell, Xavier
    23RD EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2015), 2015: 321 - 328
  • [30] Adaptive sparse matrix representation for efficient matrix-vector multiplication
    Zardoshti, Pantea
    Khunjush, Farshad
    Sarbazi-Azad, Hamid
    JOURNAL OF SUPERCOMPUTING, 2016, 72 (09): 3366 - 3386