SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Networks

Cited by: 3
Authors
Aminabadi, Reza Yazdani [1 ]
Ruwase, Olatunji [2 ]
Zhang, Minjia [2 ]
He, Yuxiong [2 ]
Arnau, Jose-Maria [3 ]
Gonzalez, Antonio [3 ]
Affiliations
[1] Microsoft, Quebec City, PQ, Canada
[2] Microsoft, Redmond, WA USA
[3] Univ Politecn Cataluna, Barcelona, Spain
Funding
EU Horizon 2020; European Research Council
Keywords
Recurrent Neural Network (RNN); Long Short-Term Memory (LSTM); accelerator; scheduling; reconfigurability; low power
DOI
10.1145/3552513
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology]
Subject Classification Code
0812
Abstract
The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Due to the recurrent nature and data dependencies of RNN computations, prior work has designed customized architectures specifically tailored to the computation pattern of RNNs, achieving high computation efficiency for certain chosen model sizes. However, given that the dimensionality of RNNs varies considerably across tasks, it is crucial to generalize this efficiency to diverse configurations. In this work, we identify adaptiveness as a key feature that is missing from today's RNN accelerators. In particular, we first show the low resource utilization and low adaptiveness of state-of-the-art RNN implementations on GPU, FPGA, and ASIC architectures. To solve these issues, we propose an intelligent tile-based dispatching mechanism that increases the adaptiveness of RNN computation and efficiently handles the data dependencies. To this end, we present Sharp, a hardware accelerator that pipelines RNN computation with an effective scheduling scheme to hide most of the dependency-induced serialization. Furthermore, Sharp employs a dynamically reconfigurable architecture to adapt to each model's characteristics. Sharp achieves average speedups of 2x, 2.8x, and 82x over state-of-the-art ASIC, FPGA, and GPU implementations, respectively, across different RNN models and resource budgets. Furthermore, Sharp provides significant energy reductions with respect to previous solutions, owing to its low power dissipation (321 GFLOPS/Watt).
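For context on the data dependency the abstract refers to, consider a textbook LSTM cell: each hidden state h_t is computed from h_{t-1}, so consecutive time steps must be evaluated serially. The NumPy sketch below shows this generic recurrence only; it is not Sharp's dispatching or scheduling design, and all names and sizes (lstm_step, W, U, hidden, and so on) are illustrative assumptions.

    import numpy as np

    def sigmoid(v):
        return 1.0 / (1.0 + np.exp(-v))

    def lstm_step(x_t, h_prev, c_prev, W, U, b):
        # One standard LSTM step. W and U stack the weights of the
        # input, forget, cell, and output gates row-wise.
        z = W @ x_t + U @ h_prev + b
        i, f, g, o = np.split(z, 4)
        c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # new cell state
        h_t = sigmoid(o) * np.tanh(c_t)                      # new hidden state
        return h_t, c_t

    # The time loop is inherently sequential: each iteration consumes
    # the h and c produced by the previous one.
    hidden = 4
    x_seq = np.random.randn(8, hidden)       # 8 time steps of input
    W = np.random.randn(4 * hidden, hidden)
    U = np.random.randn(4 * hidden, hidden)
    b = np.zeros(4 * hidden)
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in x_seq:
        h, c = lstm_step(x_t, h, c, W, U, b)

The matrix-vector products inside each step parallelize well, but the loop itself does not; hiding this step-to-step serialization through scheduling is what the abstract's pipelining claim targets.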
Pages: 23