SparseAdapt: Runtime Control for Sparse Linear Algebra on a Reconfigurable Accelerator

Cited by: 9
Authors
Pal, Subhankar [1 ]
Amarnath, Aporva [1 ]
Feng, Siying [1 ]
O'Boyle, Michael [2 ]
Dreslinski, Ronald [1 ]
Dubach, Christophe [3 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[3] McGill Univ, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; Engineering and Physical Sciences Research Council (UK);
Keywords
reconfigurable accelerators; sparse linear algebra; energy-efficient computing; machine learning; predictive models;
DOI
10.1145/3466752.3480134
Chinese Library Classification
TP3 [Computing technology; computer technology];
Discipline Code
0812;
Abstract
Dynamic adaptation is a post-silicon optimization technique that adapts the hardware to workload phases. However, current adaptive approaches are oblivious to implicit phases that arise from operating on irregular data, such as sparse linear algebra operations. Implicit phases are short-lived and do not exhibit consistent behavior throughout execution. This calls for a high-accuracy, low-overhead runtime mechanism for adaptation at a fine granularity. Moreover, adopting such techniques for reconfigurable manycore hardware, such as coarse-grained reconfigurable architectures (CGRAs), adds complexity due to synchronization and resource contention. We propose a lightweight machine learning-based adaptive framework called SparseAdapt. It enables low-overhead control of configuration parameters to tailor the hardware to both implicit (data-driven) and explicit (code-driven) phase changes. SparseAdapt is implemented within the runtime of a recently proposed CGRA called Transmuter, which has been shown to deliver high performance for irregular sparse operations. SparseAdapt can adapt configuration parameters such as resource sharing, cache capacities, prefetcher aggressiveness, and dynamic voltage-frequency scaling (DVFS). Moreover, it can operate under the constraints of either (i) high energy-efficiency (maximal GFLOPS/W), or (ii) high power-performance (maximal GFLOPS^3/W). We evaluate SparseAdapt with sparse matrix-matrix and matrix-vector multiplication (SpMSpM and SpMSpV) routines across a suite of uniform random, power-law and real-world matrices, in addition to end-to-end evaluation on two graph algorithms. SparseAdapt achieves similar performance on SpMSpM as the largest static configuration, with 5.3x better energy-efficiency. Furthermore, on both performance and efficiency, SparseAdapt is within 13% of an Oracle that adapts the configuration of each phase with global knowledge of the entire program execution.
Finally, SparseAdapt is able to outperform the state-of-the-art approach for runtime reconfiguration by up to 2.9x in terms of energy-efficiency.
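As a rough illustration of the control loop the abstract describes — a predictive model that, at each phase, selects cache capacity, prefetcher aggressiveness, and a DVFS operating point to maximize either GFLOPS/W or GFLOPS^3/W — here is a minimal, hypothetical Python sketch. All names (`Config`, `predict`, `adapt`) and the toy analytical cost model are invented for illustration; the actual SparseAdapt controller lives in the Transmuter runtime and uses trained ML models over hardware statistics.

```python
# Hypothetical sketch of SparseAdapt-style per-phase adaptation.
# NOT the authors' implementation: names and the toy model are invented.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Config:
    cache_kb: int        # reconfigurable cache capacity
    prefetch_depth: int  # prefetcher aggressiveness
    dvfs_ghz: float      # voltage-frequency operating point

# Candidate configuration space (toy values)
CONFIGS = [Config(c, p, f)
           for c, p, f in product((64, 128), (0, 4), (0.8, 1.2))]

def objective(gflops, watts, power_performance=False):
    """GFLOPS/W for energy-efficiency, GFLOPS^3/W for power-performance."""
    return (gflops ** 3 if power_performance else gflops) / watts

def predict(config, features):
    """Stand-in for a trained model mapping (config, phase features) to
    estimated (GFLOPS, watts). Here: a crude analytical toy model."""
    locality = features["reuse"]  # phase's data-reuse score in [0, 1]
    gflops = config.dvfs_ghz * (1 + locality * config.cache_kb / 128
                                + 0.05 * config.prefetch_depth)
    watts = 0.5 + config.dvfs_ghz ** 2 * (1 + config.cache_kb / 256)
    return gflops, watts

def adapt(features, power_performance=False):
    """Per-phase decision: pick the configuration the model predicts best."""
    return max(CONFIGS,
               key=lambda c: objective(*predict(c, features),
                                       power_performance))

# High-reuse phase under the efficiency objective vs. a low-reuse
# (scatter-heavy) phase under the power-performance objective.
best_eff = adapt({"reuse": 0.9})
best_perf = adapt({"reuse": 0.1}, power_performance=True)
print(best_eff, best_perf)
```

Even this toy model reproduces the qualitative trade-off the paper exploits: the GFLOPS/W objective favors a low DVFS point, while the GFLOPS^3/W objective pushes the clock higher at the cost of efficiency.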
Pages: 1005-1021
Page count: 17