SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Networks

Cited by: 3
Authors
Aminabadi, Reza Yazdani [1 ]
Ruwase, Olatunji [2 ]
Zhang, Minjia [2 ]
He, Yuxiong [2 ]
Arnau, Jose-Maria [3 ]
Gonzalez, Antonio [3 ]
Affiliations
[1] Microsoft, Quebec City, PQ, Canada
[2] Microsoft, Redmond, WA USA
[3] Univ Politecn Cataluna, Barcelona, Spain
Funding
EU Horizon 2020; European Research Council
Keywords
Recurrent Neural Network (RNN); Long Short-Term Memory (LSTM); accelerator; scheduling; reconfigurability; low power
DOI
10.1145/3552513
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology]
Subject Classification Code
0812
Abstract
The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Due to the recurrent nature and data dependencies of RNN computations, prior work has designed customized architectures specifically tailored to the computation pattern of RNNs, achieving high computation efficiency for certain chosen model sizes. However, given that the dimensionality of RNNs varies considerably across tasks, it is crucial to generalize this efficiency to diverse configurations. In this work, we identify adaptiveness as a key feature that is missing from today's RNN accelerators. In particular, we first show the low resource utilization and low adaptiveness of state-of-the-art RNN implementations on GPU, FPGA, and ASIC architectures. To solve these issues, we propose an intelligent tile-based dispatching mechanism that increases the adaptiveness of RNN computation and efficiently handles the data dependencies. To this end, we present Sharp, a hardware accelerator that pipelines RNN computation with an effective scheduling scheme to hide most of the dependency-induced serialization. Furthermore, Sharp employs a dynamically reconfigurable architecture to adapt to each model's characteristics. Sharp achieves average speedups of 2x, 2.8x, and 82x over state-of-the-art ASIC, FPGA, and GPU implementations, respectively, across different RNN models and resource budgets. Furthermore, Sharp provides significant energy reductions with respect to previous solutions, owing to its low power dissipation (321 GFLOPS/Watt).
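For context on the data dependency the abstract refers to, consider a textbook LSTM cell: each hidden state h_t is computed from h_{t-1}, so consecutive time steps must be evaluated serially. The NumPy sketch below shows this generic recurrence only; it is not Sharp's dispatching or scheduling design, and all names and sizes (lstm_step, W, U, hidden, and so on) are illustrative assumptions.

    import numpy as np

    def sigmoid(v):
        return 1.0 / (1.0 + np.exp(-v))

    def lstm_step(x_t, h_prev, c_prev, W, U, b):
        # One standard LSTM step. W and U stack the weights of the
        # input, forget, cell, and output gates row-wise.
        z = W @ x_t + U @ h_prev + b
        i, f, g, o = np.split(z, 4)
        c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # new cell state
        h_t = sigmoid(o) * np.tanh(c_t)                      # new hidden state
        return h_t, c_t

    # The time loop is inherently sequential: each iteration consumes
    # the h and c produced by the previous one.
    hidden = 4
    x_seq = np.random.randn(8, hidden)       # 8 time steps of input
    W = np.random.randn(4 * hidden, hidden)
    U = np.random.randn(4 * hidden, hidden)
    b = np.zeros(4 * hidden)
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in x_seq:
        h, c = lstm_step(x_t, h, c, W, U, b)

The matrix-vector products inside each step parallelize well, but the loop itself does not; hiding this step-to-step serialization through scheduling is what the abstract's pipelining claim targets.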
Pages: 23