SparseAdapt: Runtime Control for Sparse Linear Algebra on a Reconfigurable Accelerator

Cited by: 9
Authors
Pal, Subhankar [1 ]
Amarnath, Aporva [1 ]
Feng, Siying [1 ]
O'Boyle, Michael [2 ]
Dreslinski, Ronald [1 ]
Dubach, Christophe [3 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[3] McGill Univ, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; Engineering and Physical Sciences Research Council (UK);
Keywords
reconfigurable accelerators; sparse linear algebra; energy-efficient computing; machine learning; predictive models;
DOI
10.1145/3466752.3480134
Chinese Library Classification
TP3 [Computing technology; computer technology];
Discipline Code
0812;
Abstract
Dynamic adaptation is a post-silicon optimization technique that adapts the hardware to workload phases. However, current adaptive approaches are oblivious to implicit phases that arise from operating on irregular data, such as sparse linear algebra operations. Implicit phases are short-lived and do not exhibit consistent behavior throughout execution. This calls for a high-accuracy, low-overhead runtime mechanism for adaptation at a fine granularity. Moreover, adopting such techniques for reconfigurable manycore hardware, such as coarse-grained reconfigurable architectures (CGRAs), adds complexity due to synchronization and resource contention. We propose a lightweight machine learning-based adaptive framework called SparseAdapt. It enables low-overhead control of configuration parameters to tailor the hardware to both implicit (data-driven) and explicit (code-driven) phase changes. SparseAdapt is implemented within the runtime of a recently proposed CGRA called Transmuter, which has been shown to deliver high performance for irregular sparse operations. SparseAdapt can adapt configuration parameters such as resource sharing, cache capacities, prefetcher aggressiveness, and dynamic voltage-frequency scaling (DVFS). Moreover, it can operate under the constraints of either (i) high energy-efficiency (maximal GFLOPS/W), or (ii) high power-performance (maximal GFLOPS^3/W). We evaluate SparseAdapt with sparse matrix-matrix and matrix-vector multiplication (SpMSpM and SpMSpV) routines across a suite of uniform random, power-law and real-world matrices, in addition to end-to-end evaluation on two graph algorithms. SparseAdapt achieves similar performance on SpMSpM as the largest static configuration, with 5.3x better energy-efficiency. Furthermore, on both performance and efficiency, SparseAdapt is within 13% of an Oracle that adapts the configuration of each phase with global knowledge of the entire program execution.
Finally, SparseAdapt is able to outperform the state-of-the-art approach for runtime reconfiguration by up to 2.9x in terms of energy-efficiency.
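As a rough illustration of the control loop the abstract describes — a predictive model that, at each phase, selects cache capacity, prefetcher aggressiveness, and a DVFS operating point to maximize either GFLOPS/W or GFLOPS^3/W — here is a minimal, hypothetical Python sketch. All names (`Config`, `predict`, `adapt`) and the toy analytical cost model are invented for illustration; the actual SparseAdapt controller lives in the Transmuter runtime and uses trained ML models over hardware statistics.

```python
# Hypothetical sketch of SparseAdapt-style per-phase adaptation.
# NOT the authors' implementation: names and the toy model are invented.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Config:
    cache_kb: int        # reconfigurable cache capacity
    prefetch_depth: int  # prefetcher aggressiveness
    dvfs_ghz: float      # voltage-frequency operating point

# Candidate configuration space (toy values)
CONFIGS = [Config(c, p, f)
           for c, p, f in product((64, 128), (0, 4), (0.8, 1.2))]

def objective(gflops, watts, power_performance=False):
    """GFLOPS/W for energy-efficiency, GFLOPS^3/W for power-performance."""
    return (gflops ** 3 if power_performance else gflops) / watts

def predict(config, features):
    """Stand-in for a trained model mapping (config, phase features) to
    estimated (GFLOPS, watts). Here: a crude analytical toy model."""
    locality = features["reuse"]  # phase's data-reuse score in [0, 1]
    gflops = config.dvfs_ghz * (1 + locality * config.cache_kb / 128
                                + 0.05 * config.prefetch_depth)
    watts = 0.5 + config.dvfs_ghz ** 2 * (1 + config.cache_kb / 256)
    return gflops, watts

def adapt(features, power_performance=False):
    """Per-phase decision: pick the configuration the model predicts best."""
    return max(CONFIGS,
               key=lambda c: objective(*predict(c, features),
                                       power_performance))

# High-reuse phase under the efficiency objective vs. a low-reuse
# (scatter-heavy) phase under the power-performance objective.
best_eff = adapt({"reuse": 0.9})
best_perf = adapt({"reuse": 0.1}, power_performance=True)
print(best_eff, best_perf)
```

Even this toy model reproduces the qualitative trade-off the paper exploits: the GFLOPS/W objective favors a low DVFS point, while the GFLOPS^3/W objective pushes the clock higher at the cost of efficiency.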
Pages: 1005-1021
Page count: 17