SparseAdapt: Runtime Control for Sparse Linear Algebra on a Reconfigurable Accelerator

Cited by: 9
Authors
Pal, Subhankar [1 ]
Amarnath, Aporva [1 ]
Feng, Siying [1 ]
O'Boyle, Michael [2 ]
Dreslinski, Ronald [1 ]
Dubach, Christophe [3 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[3] McGill Univ, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; Engineering and Physical Sciences Research Council (UK);
Keywords
reconfigurable accelerators; sparse linear algebra; energy-efficient computing; machine learning; predictive models;
DOI
10.1145/3466752.3480134
CLC Number
TP3 [Computing technology and computer technology];
Discipline Code
0812;
Abstract
Dynamic adaptation is a post-silicon optimization technique that adapts the hardware to workload phases. However, current adaptive approaches are oblivious to implicit phases that arise from operating on irregular data, such as sparse linear algebra operations. Implicit phases are short-lived and do not exhibit consistent behavior throughout execution. This calls for a high-accuracy, low-overhead runtime mechanism for adaptation at a fine granularity. Moreover, adopting such techniques for reconfigurable manycore hardware, such as coarse-grained reconfigurable architectures (CGRAs), adds complexity due to synchronization and resource contention. We propose a lightweight machine learning-based adaptive framework called SparseAdapt. It enables low-overhead control of configuration parameters to tailor the hardware to both implicit (data-driven) and explicit (code-driven) phase changes. SparseAdapt is implemented within the runtime of a recently proposed CGRA called Transmuter, which has been shown to deliver high performance for irregular sparse operations. SparseAdapt can adapt configuration parameters such as resource sharing, cache capacities, prefetcher aggressiveness, and dynamic voltage-frequency scaling (DVFS). Moreover, it can operate under the constraints of either (i) high energy-efficiency (maximal GFLOPS/W), or (ii) high power-performance (maximal GFLOPS^3/W). We evaluate SparseAdapt with sparse matrix-matrix and matrix-vector multiplication (SpMSpM and SpMSpV) routines across a suite of uniform random, power-law, and real-world matrices, in addition to end-to-end evaluation on two graph algorithms. SparseAdapt achieves similar performance on SpMSpM as the largest static configuration, with 5.3x better energy-efficiency. Furthermore, on both performance and efficiency, SparseAdapt is within 13% of an Oracle that adapts the configuration of each phase with global knowledge of the entire program execution. Finally, SparseAdapt outperforms the state-of-the-art approach for runtime reconfiguration by up to 2.9x in terms of energy-efficiency.
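The abstract names two optimization targets: energy-efficiency (GFLOPS/W) and power-performance (GFLOPS^3/W, an ED^2P-style metric that weights throughput more heavily). The following minimal Python sketch illustrates how the two objectives can rank the same candidate configurations differently; the configuration names and measurements are invented for illustration and are not from the paper.

```python
# Illustration of the two objectives from the abstract:
# (i) energy-efficiency = GFLOPS / W, (ii) power-performance = GFLOPS^3 / W.
# All configuration names and numbers below are hypothetical.

def energy_efficiency(gflops, watts):
    """Objective (i): throughput per watt."""
    return gflops / watts

def power_performance(gflops, watts):
    """Objective (ii): cubed throughput per watt, favoring faster configs."""
    return gflops ** 3 / watts

# Hypothetical per-configuration measurements: (GFLOPS, watts).
configs = {
    "low_dvfs_small_cache":  (4.0, 1.0),
    "mid_dvfs_shared_cache": (8.0, 3.0),
    "high_dvfs_prefetch_on": (12.0, 9.0),
}

best_eff = max(configs, key=lambda c: energy_efficiency(*configs[c]))
best_pp = max(configs, key=lambda c: power_performance(*configs[c]))
print(best_eff)  # low_dvfs_small_cache wins on GFLOPS/W
print(best_pp)   # high_dvfs_prefetch_on wins on GFLOPS^3/W
```

Note how the cubed-throughput objective flips the ranking: the slow, frugal configuration wins on GFLOPS/W, while the fast, power-hungry one wins on GFLOPS^3/W, which is why the framework accepts the objective as a constraint rather than fixing one metric.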
Pages: 1005-1021 (17 pages)
Related Papers (50 records; first 10 shown)
  • [1] ExTensor: An Accelerator for Sparse Tensor Algebra
    Hegde, Kartik
    Asghari-Moghaddam, Hadi
    Pellauer, Michael
    Crago, Neal
    Jaleel, Aamer
    Solomonik, Edgar
    Emer, Joel S.
    Fletcher, Christopher W.
    MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, : 319 - 333
  • [2] An Energy Efficient and Runtime Reconfigurable Accelerator for Robotic Localization
    Liu, Qiang
    Hao, Yuhui
    Liu, Weizhuang
    Yu, Bo
    Gan, Yiming
    Tang, Jie
    Liu, Shao-Shan
    Zhu, Yuhao
    IEEE TRANSACTIONS ON COMPUTERS, 2023, 72 (07) : 1943 - 1957
  • [3] Evaluation of an Analog Accelerator for Linear Algebra
    Huang, Yipeng
    Guo, Ning
    Seok, Mingoo
    Tsividis, Yannis
    Sethumadhavan, Simha
    2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, : 570 - 582
  • [4] Runtime Reconfigurable Hardware Accelerator for Energy-Efficient Transposed Convolutions
    Marrazzo, Emanuel
    Spagnolo, Fanny
    Perri, Stefania
    PRIME 2022: 17TH INTERNATIONAL CONFERENCE ON PHD RESEARCH IN MICROELECTRONICS AND ELECTRONICS, 2022, : 49 - 52
  • [5] A Runtime Reconfigurable Design of Compute-in-Memory based Hardware Accelerator
    Lu, Anni
    Peng, Xiaochen
    Luo, Yandong
    Huang, Shanshi
    Yu, Shimeng
    PROCEEDINGS OF THE 2021 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2021), 2021, : 932 - 937
  • [6] RAMAN: A Reconfigurable and Sparse tinyML Accelerator for Inference on Edge
    Krishna, Adithya
    Rohit Nudurupati, Srikanth
    Chandana, D. G.
    Dwivedi, Pritesh
    van Schaik, Andre
    Mehendale, Mahesh
    Thakur, Chetan Singh
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (14): : 24831 - 24845
  • [7] ALRESCHA: A Lightweight Reconfigurable Sparse-Computation Accelerator
    Asgari, Bahar
    Hadidi, Ramyad
    Krishna, Tushar
    Kim, Hyesoon
    Yalamanchili, Sudhakar
    2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2020), 2020, : 249 - 260
  • [8] Transforming A Linear Algebra Core to An FFT Accelerator
    Pedram, Ardavan
    McCalpin, John
    Gerstlauer, Andreas
    PROCEEDINGS OF THE 2013 IEEE 24TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP 13), 2013, : 175 - 184
  • [9] Porting Sparse Linear Algebra to Intel GPUs
    Tsai, Yuhsiang M.
    Cojean, Terry
    Anzt, Hartwig
    EURO-PAR 2021: PARALLEL PROCESSING WORKSHOPS, 2022, 13098 : 57 - 68
  • [10] Usability levels for sparse linear algebra components
    Sosonkina, M.
    Liu, F.
    Bramley, R.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2008, 20 (12): : 1439 - 1454