Accelerating matrix-centric graph processing on GPUs through bit-level optimizations

Cited by: 0
Authors
Chen, Jou-An [1]
Sung, Hsin-Hsuan [1]
Shen, Xipeng [1]
Tallent, Nathan [2]
Barker, Kevin [2]
Li, Ang [2]
Affiliations
[1] North Carolina State Univ, Dept Comp Sci, Raleigh, NC 27695 USA
[2] Pacific Northwest Natl Lab, Richland, WA USA
Funding
U.S. National Science Foundation;
Keywords
GraphBLAS; Bit manipulation; GPU; Sparse matrix; Deep reinforcement learning;
DOI
10.1016/j.jpdc.2023.02.013
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Even though it is well known that binary values are common in graph applications (e.g., in adjacency matrices), how to leverage this property for efficiency has not yet been adequately explored. This paper presents a systematic study of how to unlock the potential of bit-level optimizations for graph computations that involve binary values. It proposes a two-level representation named Bit-Block Compressed Sparse Row (B2SR) and presents a series of optimizations to the graph operations on B2SR that use the intrinsics of modern GPUs. It additionally introduces Deep Reinforcement Learning (DRL) as an efficient way to configure the bit-level optimizations on the fly. The DQN-based adaptive tile size selector, with dedicated model training, can reach 68% prediction accuracy. Evaluations on NVIDIA Pascal and Volta GPUs show that the optimizations yield speedups of up to 40x and 6555x for the essential GraphBLAS kernels SpMV and SpGEMM, respectively, and accelerate GraphBLAS-based BFS by up to 433x, SSSP, PR, and CC by up to 35x, and TC by up to 52x. (c) 2023 Elsevier Inc. All rights reserved.
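To make the idea of a two-level bit-block representation concrete, the following is a minimal CPU-side sketch, not the paper's implementation. It assumes a fixed 8x8 tile (the abstract only says the tile size is selected adaptively), an upper level that is a CSR index over nonempty tiles, and a lower level that stores each tile row as one bit-packed integer. All function names (`pack_bit_tiles`, `pack_vector`, `bit_spmv`) are hypothetical. The binary SpMV then reduces to a bitwise AND followed by a population count, which on a GPU would typically map to an intrinsic such as `__popc`.

```python
from typing import List, Tuple

TILE = 8  # assumed fixed tile size, chosen here only for illustration


def pack_bit_tiles(dense: List[List[int]]) -> Tuple[List[int], List[int], List[List[int]]]:
    """Pack a 0/1 matrix into a CSR index over nonempty TILE x TILE bit tiles."""
    n_rows, n_cols = len(dense), len(dense[0])
    tile_rows = (n_rows + TILE - 1) // TILE
    tile_cols = (n_cols + TILE - 1) // TILE
    row_ptr, col_idx, tiles = [0], [], []
    for tr in range(tile_rows):
        for tc in range(tile_cols):
            words = []
            for r in range(tr * TILE, min((tr + 1) * TILE, n_rows)):
                word = 0
                for c in range(tc * TILE, min((tc + 1) * TILE, n_cols)):
                    if dense[r][c]:
                        word |= 1 << (c - tc * TILE)  # bit j = column j of the tile
                words.append(word)
            if any(words):  # only nonempty tiles are stored
                col_idx.append(tc)
                tiles.append(words)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, tiles


def pack_vector(x: List[int]) -> List[int]:
    """Pack a 0/1 vector into one TILE-bit word per tile column."""
    words = [0] * ((len(x) + TILE - 1) // TILE)
    for i, v in enumerate(x):
        if v:
            words[i // TILE] |= 1 << (i % TILE)
    return words


def bit_spmv(row_ptr, col_idx, tiles, x_words, n_rows):
    """Compute y = A @ x for binary A and x: AND each tile row with x, then popcount."""
    y = [0] * n_rows
    for tr in range(len(row_ptr) - 1):
        for k in range(row_ptr[tr], row_ptr[tr + 1]):
            xw = x_words[col_idx[k]]
            for i, word in enumerate(tiles[k]):
                y[tr * TILE + i] += bin(word & xw).count("1")
    return y
```

In this sketch each stored tile row replaces up to eight separate column indices with a single small integer, which is where the storage and memory-traffic savings of bit packing would come from; the actual B2SR layout and kernels may differ.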
Pages: 53-67
Page count: 15
Related papers
46 in total
  • [1] Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU
    Chen, Jou-An
    Sung, Hsin-Hsuan
    Shen, Xipeng
    Tallent, Nathan
    Barker, Kevin
    Li, Ang
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2022), 2022: 515-525
  • [2] Accelerating Matrix Processing with GPUs
    Malaya, Nicholas
    Che, Shuai
    Greathouse, Joseph L.
    van Oostrum, Rene
    Schulte, Michael J.
    2017 IEEE 24TH SYMPOSIUM ON COMPUTER ARITHMETIC (ARITH), 2017: 139-141
  • [3] GraphPEG: Accelerating Graph Processing on GPUs
    Lu, Yashuai
    Guo, Hui
    Huang, Libo
    Yu, Qi
    Shen, Li
    Xiao, Nong
    Wang, Zhiying
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2021, 18 (03)
  • [4] Accelerating Unstructured Graph Data Processing on GPUs
    Pan, Xiaohui
    2ND INTERNATIONAL CONFERENCE ON SIMULATION AND MODELING METHODOLOGIES, TECHNOLOGIES AND APPLICATIONS (SMTA 2015), 2015: 29-33
  • [5] An improved architecture for bit-level matrix multiplication
    Grover, RS
    Shang, WJ
    Li, Q
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-V, 2000: 2257-2264
  • [6] Reduce, Reuse, and Adapt: Accelerating Graph Processing on GPUs
    Ullas, A.
    Nasre, Rupesh
    Govindarajan, R.
    2023 IEEE 30TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS, HIPC 2023, 2023: 335-346
  • [7] DESIGN OF BIT-LEVEL SYSTOLIC ARRAYS WITH DEPENDENCE GRAPH
    LIU, CM
    JEN, CW
    SYSTOLIC ARRAY PROCESSORS, 1989: 439-448
  • [8] Accelerating Complex Event Processing through GPUs
    Rodrigo, Prabodha Srimal
    Bandara, H. M. N. Dilum
    Perera, Srinath
    2015 IEEE 22ND INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2015: 325-334
  • [9] Bit-Beading: Stringing bit-level MAC results for Accelerating Neural Networks
    Anwar, Zeeshan
    Longchar, Imlijungla
    Kapoor, Hemangee K.
    PROCEEDINGS OF THE 37TH INTERNATIONAL CONFERENCE ON VLSI DESIGN, VLSID 2024 AND 23RD INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS, ES 2024, 2024: 216-221
  • [10] Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks
    Sharma, Hardik
    Park, Jongse
    Suda, Naveen
    Lai, Liangzhen
    Chau, Benson
    Chandra, Vikas
    Esmaeilzadeh, Hadi
    2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2018: 764-775