POLAR: A Pipelined/Overlapped FPGA-Based LSTM Accelerator

Cited by: 34
Authors
Bank-Tavakoli, Erfan [1 ]
Ghasemzadeh, Seyed Abolfazl [1 ]
Kamal, Mehdi [1 ]
Afzali-Kusha, Ali [1 ]
Pedram, Massoud [2 ]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Elect & Comp Engn, Tehran 1417466191, Iran
[2] Univ Southern Calif, Dept Elect Engn, Los Angeles, CA 90007 USA
Keywords
Computer architecture; Field programmable gate arrays; Logic gates; Resource management; Timing; Power demand; Hardware; Field-programmable gate array (FPGA); high speed; low power; low resource utilization; long short-term memory (LSTM) accelerator;
DOI
10.1109/TVLSI.2019.2947639
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this brief, a low resource utilization field-programmable gate array (FPGA)-based long short-term memory (LSTM) network architecture for accelerating the inference phase is presented. The architecture has low-power and high-speed features that are achieved through overlapping the timing of the operations and pipelining the datapath. Moreover, this architecture requires negligible internal memory size for storing the intermediate data, leading to low resource utilization and simple routing, which provides lower interconnect delay (higher operating frequency). A designer may adjust the resource utilization (as well as the latency) of the proposed architecture readily at the register-transfer level (RTL) design by adjusting the amount of parallelization. This makes the process of mapping the architecture onto different types of FPGAs, subject to defined constraints, a simple one. The efficacy of the proposed architecture is assessed by implementing an LSTM network on different types of FPGAs. Compared with recent works, the proposed architecture provides up to about 1.6x, 43.6x, 21.9x, and 114.5x improvements in frequency, power efficiency, GOP/s, and GOP/s/W, respectively. Finally, our proposed architecture operates at 17.64 GOP/s, which is 2.31x faster than the best previously reported results.
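For orientation, the inference computation that such an accelerator targets can be sketched in software. The following is a minimal sketch of one step of a standard textbook LSTM cell, not the POLAR hardware architecture itself; the function names (`lstm_step`, `matvec`, `sigmoid`) and the `params` layout are assumptions made for illustration only. The four gate pre-activations are independent of one another, which is the kind of structure that overlapping and pipelining can plausibly exploit.

```python
import math

def sigmoid(x):
    # Logistic function used by the input, forget, and output gates.
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Dense matrix-vector product: the dominant cost in LSTM inference.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def lstm_step(x, h_prev, c_prev, params):
    """One inference step of a standard LSTM cell (illustrative only).

    params maps a gate name ('i', 'f', 'g', 'o') to a tuple (W, U, b):
    input weights, recurrent weights, and bias for that gate.
    """
    pre = {}
    for gate, (W, U, b) in params.items():
        wx = matvec(W, x)        # contribution of the current input
        uh = matvec(U, h_prev)   # contribution of the previous hidden state
        pre[gate] = [a + r + bias for a, r, bias in zip(wx, uh, b)]

    i = [sigmoid(v) for v in pre["i"]]    # input gate
    f = [sigmoid(v) for v in pre["f"]]    # forget gate
    g = [math.tanh(v) for v in pre["g"]]  # candidate cell state
    o = [sigmoid(v) for v in pre["o"]]    # output gate

    # Element-wise state update and output.
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c
```

In this sketch the number of matrix rows processed at once is fixed by the Python loops; a hardware design could instead process several rows in parallel, which is one way to read the abstract's trade-off between resource utilization and latency.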
Pages: 838-842
Page count: 5
Related papers
50 in total
  • [1] FPGA-based Pipelined LSTM accelerator with Approximate matrix multiplication technique
    Chaudhary, Aniket
    Kumar, Arun
    Srivastava, Ayush
    Suneja, Kriti
    [J]. 2021 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER TECHNOLOGIES AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2021, : 438 - 442
  • [2] Flexible Deep-pipelined FPGA-based Accelerator for Spiking Neural Networks
    Lopez-Asuncion, Samuel
    Ituero Herrero, Pablo
    [J]. 2023 38TH CONFERENCE ON DESIGN OF CIRCUITS AND INTEGRATED SYSTEMS, DCIS, 2023,
  • [3] A pipelined fast 2D-DCT accelerator for FPGA-based SoCs
    Tumeo, Antonino
    Monchiero, Matteo
    Palermo, Gianluca
    Ferrandi, Fabrizio
    Sciuto, Donatella
    [J]. IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI, PROCEEDINGS: EMERGING VLSI TECHNOLOGIES AND ARCHITECTURES, 2007, : 331 - +
  • [4] An FPGA-based Accelerator for Rapid Simulation of SC Decoding of Polar Codes
    Wuethrich, Johannes
    Balatsoukas-Stimming, Alexios
    Burg, Andreas
    [J]. 2015 IEEE CONFERENCE ON ELECTRONICS, CIRCUITS, AND SYSTEMS (ICECS), 2015, : 633 - 636
  • [5] A deeply-pipelined FPGA-based SpMV accelerator with a hardware-friendly storage scheme
    Guo, Song
    Dou, Yong
    Lei, Yuanwu
    Wu, Guiming
    [J]. IEICE ELECTRONICS EXPRESS, 2015, 12 (11):
  • [6] An FPGA-Based accelerator for multiphysics modeling
    Huang, XM
    Ma, J
    [J]. ERSA '04: THE 2004 INTERNATIONAL CONFERENCE ON ENGINEERING OF RECONFIGURABLE SYSTEMS AND ALGORITHMS, 2004, : 209 - 212
  • [7] Top-down implementation of pipelined AES cipher and its verification with FPGA-based simulation accelerator
    Lee, JG
    Hwangbo, W
    Kim, S
    Kyung, CM
    [J]. 2005 6TH INTERNATIONAL CONFERENCE ON ASIC PROCEEDINGS, BOOKS 1 AND 2, 2005, : 140 - 143
  • [8] A Method for Automatically Implementing FPGA-based Pipelined Microprocessors
    Zeng, Yu-xiang
    Wan, Han
    Jiang, Bo
    Gao, Xiao-peng
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER NETWORKS AND COMMUNICATION TECHNOLOGY (CNCT 2016), 2016, 54 : 467 - 474
  • [9] Pipelined Parallel Join and Its FPGA-Based Acceleration
    Yoshimi, Masato
    Oge, Yasin
    Yoshinaga, Tsutomu
    [J]. ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2017, 10 (04)
  • [10] FPGA-Based Vehicle Detection and Tracking Accelerator
    Zhai, Jiaqi
    Li, Bin
    Lv, Shunsen
    Zhou, Qinglei
    [J]. SENSORS, 2023, 23 (04)