SparseAdapt: Runtime Control for Sparse Linear Algebra on a Reconfigurable Accelerator

Cited by: 9
Authors
Pal, Subhankar [1 ]
Amarnath, Aporva [1 ]
Feng, Siying [1 ]
O'Boyle, Michael [2 ]
Dreslinski, Ronald [1 ]
Dubach, Christophe [3 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[3] McGill Univ, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; Engineering and Physical Sciences Research Council (UK);
Keywords
reconfigurable accelerators; sparse linear algebra; energy-efficient computing; machine learning; predictive models;
DOI
10.1145/3466752.3480134
CLC Number
TP3 [Computing technology and computer technology];
Discipline Code
0812;
Abstract
Dynamic adaptation is a post-silicon optimization technique that adapts the hardware to workload phases. However, current adaptive approaches are oblivious to implicit phases that arise from operating on irregular data, such as sparse linear algebra operations. Implicit phases are short-lived and do not exhibit consistent behavior throughout execution. This calls for a high-accuracy, low-overhead runtime mechanism for adaptation at a fine granularity. Moreover, adopting such techniques for reconfigurable manycore hardware, such as coarse-grained reconfigurable architectures (CGRAs), adds complexity due to synchronization and resource contention. We propose a lightweight machine learning-based adaptive framework called SparseAdapt. It enables low-overhead control of configuration parameters to tailor the hardware to both implicit (data-driven) and explicit (code-driven) phase changes. SparseAdapt is implemented within the runtime of a recently proposed CGRA called Transmuter, which has been shown to deliver high performance for irregular sparse operations. SparseAdapt can adapt configuration parameters such as resource sharing, cache capacities, prefetcher aggressiveness, and dynamic voltage-frequency scaling (DVFS). Moreover, it can operate under the constraints of either (i) high energy-efficiency (maximal GFLOPS/W), or (ii) high power-performance (maximal GFLOPS^3/W). We evaluate SparseAdapt with sparse matrix-matrix and matrix-vector multiplication (SpMSpM and SpMSpV) routines across a suite of uniform random, power-law, and real-world matrices, in addition to end-to-end evaluation on two graph algorithms. SparseAdapt achieves similar performance on SpMSpM as the largest static configuration, with 5.3x better energy-efficiency. Furthermore, on both performance and efficiency, SparseAdapt is within 13% of an Oracle that adapts the configuration of each phase with global knowledge of the entire program execution. Finally, SparseAdapt outperforms the state-of-the-art approach for runtime reconfiguration by up to 2.9x in terms of energy-efficiency.
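The abstract names two optimization targets: energy-efficiency (GFLOPS/W) and power-performance (GFLOPS^3/W, an ED^2P-style metric that weights throughput more heavily). The following minimal Python sketch illustrates how the two objectives can rank the same candidate configurations differently; the configuration names and measurements are invented for illustration and are not from the paper.

```python
# Illustration of the two objectives from the abstract:
# (i) energy-efficiency = GFLOPS / W, (ii) power-performance = GFLOPS^3 / W.
# All configuration names and numbers below are hypothetical.

def energy_efficiency(gflops, watts):
    """Objective (i): throughput per watt."""
    return gflops / watts

def power_performance(gflops, watts):
    """Objective (ii): cubed throughput per watt, favoring faster configs."""
    return gflops ** 3 / watts

# Hypothetical per-configuration measurements: (GFLOPS, watts).
configs = {
    "low_dvfs_small_cache":  (4.0, 1.0),
    "mid_dvfs_shared_cache": (8.0, 3.0),
    "high_dvfs_prefetch_on": (12.0, 9.0),
}

best_eff = max(configs, key=lambda c: energy_efficiency(*configs[c]))
best_pp = max(configs, key=lambda c: power_performance(*configs[c]))
print(best_eff)  # low_dvfs_small_cache wins on GFLOPS/W
print(best_pp)   # high_dvfs_prefetch_on wins on GFLOPS^3/W
```

Note how the cubed-throughput objective flips the ranking: the slow, frugal configuration wins on GFLOPS/W, while the fast, power-hungry one wins on GFLOPS^3/W, which is why the framework accepts the objective as a constraint rather than fixing one metric.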
Pages: 1005-1021 (17 pages)
Related Papers (50 records; first 10 shown)
  • [1] ExTensor: An Accelerator for Sparse Tensor Algebra
    Hegde, Kartik
    Asghari-Moghaddam, Hadi
    Pellauer, Michael
    Crago, Neal
    Jaleel, Aamer
    Solomonik, Edgar
    Emer, Joel S.
    Fletcher, Christopher W.
    MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, : 319 - 333
  • [2] An Energy Efficient and Runtime Reconfigurable Accelerator for Robotic Localization
    Liu, Qiang
    Hao, Yuhui
    Liu, Weizhuang
    Yu, Bo
    Gan, Yiming
    Tang, Jie
    Liu, Shao-Shan
    Zhu, Yuhao
    IEEE TRANSACTIONS ON COMPUTERS, 2023, 72 (07) : 1943 - 1957
  • [3] Evaluation of an Analog Accelerator for Linear Algebra
    Huang, Yipeng
    Guo, Ning
    Seok, Mingoo
    Tsividis, Yannis
    Sethumadhavan, Simha
    2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, : 570 - 582
  • [4] Runtime Reconfigurable Hardware Accelerator for Energy-Efficient Transposed Convolutions
    Marrazzo, Emanuel
    Spagnolo, Fanny
    Perri, Stefania
    PRIME 2022: 17TH INTERNATIONAL CONFERENCE ON PHD RESEARCH IN MICROELECTRONICS AND ELECTRONICS, 2022, : 49 - 52
  • [5] A Runtime Reconfigurable Design of Compute-in-Memory based Hardware Accelerator
    Lu, Anni
    Peng, Xiaochen
    Luo, Yandong
    Huang, Shanshi
    Yu, Shimeng
    PROCEEDINGS OF THE 2021 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2021), 2021, : 932 - 937
  • [6] RAMAN: A Reconfigurable and Sparse tinyML Accelerator for Inference on Edge
    Krishna, Adithya
    Rohit Nudurupati, Srikanth
    Chandana, D. G.
    Dwivedi, Pritesh
    van Schaik, Andre
    Mehendale, Mahesh
    Thakur, Chetan Singh
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (14): : 24831 - 24845
  • [7] ALRESCHA: A Lightweight Reconfigurable Sparse-Computation Accelerator
    Asgari, Bahar
    Hadidi, Ramyad
    Krishna, Tushar
    Kim, Hyesoon
    Yalamanchili, Sudhakar
    2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2020), 2020, : 249 - 260
  • [8] Transforming A Linear Algebra Core to An FFT Accelerator
    Pedram, Ardavan
    McCalpin, John
    Gerstlauer, Andreas
    PROCEEDINGS OF THE 2013 IEEE 24TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP 13), 2013, : 175 - 184
  • [9] Porting Sparse Linear Algebra to Intel GPUs
    Tsai, Yuhsiang M.
    Cojean, Terry
    Anzt, Hartwig
    EURO-PAR 2021: PARALLEL PROCESSING WORKSHOPS, 2022, 13098 : 57 - 68
  • [10] Usability levels for sparse linear algebra components
    Sosonkina, M.
    Liu, F.
    Bramley, R.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2008, 20 (12): : 1439 - 1454