Cache-aware Sparse Matrix Formats for Kepler GPU

Authors
Nagasaka, Yusuke [1]
Nukada, Akira [1]
Matsuoka, Satoshi [1]
Affiliations
[1] Tokyo Inst Technol, Meguro Ku, Tokyo 1528550, Japan
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Scientific simulations often require solving extremely large systems of sparse linear equations, whose dominant kernel is sparse matrix-vector multiplication (SpMV). On modern many-core processors such as GPUs or MIC, this operation is known to be a significant bottleneck and to yield extremely poor efficiency, because of limited processor-to-memory bandwidth and a low cache hit ratio caused by random access to the input vector. Our family of new sparse matrix formats for many-core processors significantly increases the cache hit ratio, and thus performance, by segmenting the matrix along the columns, dividing the work among the many cores up to the internal cache capacity, and aggregating the results afterward. Performance studies show that we achieve up to 3.0x speedup in SpMV and 1.68x in multi-node CG, compared to the best vendor libraries and to recently proposed competing formats such as SELL-C-sigma.
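The column-segmentation idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' format or their GPU kernel: the function names, the coordinate-triplet storage, and the `seg_width` parameter (standing in for "as many input-vector entries as fit in cache") are all assumptions made for illustration. It only shows the three steps named above: split the nonzeros by column segment, compute one partial result per segment, then sum the partial results.

```python
from typing import List, Tuple

Triplet = Tuple[int, int, float]  # (row, col, value); hypothetical storage


def split_by_column_segment(
    entries: List[Triplet], n_cols: int, seg_width: int
) -> List[List[Triplet]]:
    """Group nonzeros so each group touches one contiguous slice of x."""
    n_segs = (n_cols + seg_width - 1) // seg_width
    segments: List[List[Triplet]] = [[] for _ in range(n_segs)]
    for row, col, val in entries:
        segments[col // seg_width].append((row, col, val))
    return segments


def segmented_spmv(
    entries: List[Triplet],
    n_rows: int,
    n_cols: int,
    x: List[float],
    seg_width: int,
) -> List[float]:
    """Compute y = A x segment by segment, then aggregate the partials."""
    partials: List[List[float]] = []
    for seg in split_by_column_segment(entries, n_cols, seg_width):
        # Within one segment, only seg_width consecutive entries of x are
        # read; on real hardware that slice is what should stay in cache.
        y_part = [0.0] * n_rows
        for row, col, val in seg:
            y_part[row] += val * x[col]
        partials.append(y_part)

    # Aggregation step: sum the per-segment partial results.
    y = [0.0] * n_rows
    for y_part in partials:
        for i in range(n_rows):
            y[i] += y_part[i]
    return y


def reference_spmv(
    entries: List[Triplet], n_rows: int, x: List[float]
) -> List[float]:
    """Unsegmented SpMV, used only to check the segmented version."""
    y = [0.0] * n_rows
    for row, col, val in entries:
        y[row] += val * x[col]
    return y
```

With a 3 x 4 example matrix and `seg_width=2`, `segmented_spmv` produces the same vector as `reference_spmv`. The sketch spends extra memory on one partial vector per segment; the abstract's aggregation step implies some cost of this kind, but how the actual formats handle it is not stated in this record.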
Pages: 281-288
Number of pages: 8
Related papers
  • [1] Cache-Aware Matrix Polynomials
    Huber, Dominik
    Schreiber, Martin
    Yang, Dai
    Schulz, Martin
    [J]. COMPUTATIONAL SCIENCE - ICCS 2020, PT I, 2020, 12137 : 132 - 146
  • [2] Cache-Aware GPU Memory Scheduling Scheme for CT Back-Projection
    Zheng, Ziyi
    Mueller, Klaus
    [J]. 2010 IEEE NUCLEAR SCIENCE SYMPOSIUM CONFERENCE RECORD (NSS/MIC), 2010, : 2248 - 2251
  • [3] Cache-Aware Sampling Strategies for Texture-Based Ray Casting on GPU
    Wang, Junpeng
    Yang, Fei
    Cao, Yong
    [J]. 2014 IEEE 4TH SYMPOSIUM ON LARGE DATA ANALYSIS AND VISUALIZATION (LDAV), 2014, : 19 - 26
  • [4] Cache-Aware Source Coding
    Hanna, Osama A.
    Nafie, Mohammed
    El-Keyi, Amr
    [J]. IEEE COMMUNICATIONS LETTERS, 2018, 22 (06) : 1144 - 1147
  • [5] Cache-aware and cache-oblivious adaptive sorting
    Brodal, GS
    Fagerberg, R
    Moruz, G
    [J]. AUTOMATA, LANGUAGES AND PROGRAMMING, PROCEEDINGS, 2005, 3580 : 576 - 588
  • [6] Cache-aware algorithm for multidimensional correlations
    Altman, E. A.
    Vaseeva, T. V.
    Aleksandrov, A. V.
    [J]. MECHANICAL SCIENCE AND TECHNOLOGY UPDATE (MSTU 2019), 2019, 1260
  • [7] CAGE: Cache-Aware Graphlet Enumeration
    Conte, Alessio
    Grossi, Roberto
    Rucci, Davide
    [J]. STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2023, 2023, 14240 : 129 - 142
  • [8] Cache-aware optimization of BAN applications
    Lei Ju
    Yun Liang
    Samarjit Chakraborty
    Tulika Mitra
    Abhik Roychoudhury
    [J]. Design Automation for Embedded Systems, 2009, 13 : 159 - 178
  • [9] Cache-Aware Iteration Space Partitioning
    Kejariwal, Arun
    Nicolau, Alexandru
    Banerjee, Utpal
    Veidenbaum, Alexander V.
    Polychronopoulos, Constantine D.
    [J]. PPOPP'08: PROCEEDINGS OF THE 2008 ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING, 2008, : 269 - 270
  • [10] Cache-aware optimization of BAN applications
    Ju, Lei
    Liang, Yun
    Chakraborty, Samarjit
    Mitra, Tulika
    Roychoudhury, Abhik
    [J]. DESIGN AUTOMATION FOR EMBEDDED SYSTEMS, 2009, 13 (03) : 159 - 178