Implementation and Optimization of the Accelerator Based on FPGA Hardware for LSTM Network

Cited by: 10
Authors
Zhang, Yiwei [1]
Wang, Chao
Gong, Lei
Lu, Yuntao
Sun, Fan
Xu, Chongchong
Li, Xi
Zhou, Xuehai
Affiliations
[1] USTC, Dept Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
Funding
National Science Foundation (USA);
Keywords
DOI
10.1109/ISPA/IUCC.2017.00098
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Artificial neural networks (ANNs) are important machine learning methods that are widely used in a variety of applications. Recurrent neural networks (RNNs), an emerging class of ANNs, are often used for sequence-related applications, and Long Short-Term Memory (LSTM) is an improved RNN that contains complex computational logic. To achieve high accuracy, researchers often build large-scale LSTM networks, which are time-consuming and power-hungry. Accelerating LSTM networks while keeping power and energy consumption low has therefore become a hot research topic. In this paper, we present a hardware accelerator for the LSTM neural network layer on the Zedboard FPGA platform and use pipelining to parallelize the forward computation. To optimize our implementation, we also apply several techniques: tiled matrix-vector multiplication, a binary adder tree, and overlap of computation with data access. With these acceleration and optimization methods, our accelerator is power-efficient and outperforms the ARM Cortex-A9 and Intel Core i5 processors.
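Two of the optimizations named in the abstract, tiled matrix-vector multiplication and the binary adder tree, can be illustrated with a small software model. The sketch below is not the authors' design: the function names `adder_tree_sum` and `tiled_matvec`, the tile size of 4, and the choice to tile along columns are all assumptions made for illustration, and the code models only the order of operations, not hardware timing or the overlap of computation with data access.

```python
import numpy as np

def adder_tree_sum(values):
    """Sum values pairwise, level by level, the way a binary adder tree does."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0.0)  # pad an odd level with a zero operand
        # Each pass models one tree level: adjacent pairs are added.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]

def tiled_matvec(W, x, tile=4):
    """Compute W @ x one column tile at a time (tile size is hypothetical)."""
    rows, cols = W.shape
    y = np.zeros(rows)
    for start in range(0, cols, tile):
        stop = min(start + tile, cols)
        for r in range(rows):
            # Multiply one tile of a row with the matching slice of x,
            # reduce the products with the adder tree, then accumulate.
            products = W[r, start:stop] * x[start:stop]
            y[r] += adder_tree_sum(products)
    return y
```

In hardware, tiling of this kind would typically let a fixed number of multipliers and a fixed-depth adder tree process a matrix of any width, with only one tile of weights held on chip at a time. Here the loop simply makes that partitioning explicit.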
Pages: 614 - 621
Page count: 8
Related papers
50 in total
  • [41] A Hardware-Oriented Echo State Network for FPGA Implementation
    Honda, Kentaro
    Tamukoh, Hakaru
    PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB2020), 2020, : 187 - 190
  • [42] A digital neural network FPGA direct hardware implementation algorithm
    Dinu, Andrei
    Cirstea, Marcian
    2007 IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS, PROCEEDINGS, VOLS 1-8, 2007, : 2307 - +
  • [43] Hardware implementation of the simplified digital spiking neural network on FPGA
    Lee K.
    Kim Y.
    IEIE Transactions on Smart Processing and Computing, 2019, 8 (05): : 405 - 414
  • [44] Hardware implementation of CMAC neural network using FPGA approach
    Chung, Chao-Ming
    Lin, Chih-Min
    Chiang, Ching-Tsan
    Yeung, Daniel S.
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 2005 - +
  • [45] Implementation of an FPGA based accelerator for Virtual Private Networks
    Cheung, OYH
    Leong, PHW
    2002 IEEE INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT), PROCEEDINGS, 2002, : 34 - 41
  • [46] Winograd Neural Network Accelerator Using Dynamic Hardware Reconfiguration on FPGA Platform
    Mei, Bingxiao
    Teng, Wenbin
    Zhang, Chi
    Wang, Wenhao
    Li, Fuqiang
    Yuan, Fuli
    Computer Engineering and Applications, 2024, 60 (22) : 323 - 334
  • [47] Hardware optimization of compressed sensing based on FPGA
    1600, International Frequency Sensor Association, 46 Thorny Vineway, Toronto, ON M2J 4J2, Canada (25):
  • [48] A survey of FPGA-based hardware implementation of ANNs
    Liu, JH
    Liang, DQ
    PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS AND BRAIN, VOLS 1-3, 2005, : 915 - 918
  • [49] An FPGA-based Hardware Accelerator for Scene Text Character Recognition
    de Oliveira Junior, Luiz Antonio
    Barros, Edna
    PROCEEDINGS OF THE 2018 26TH IFIP/IEEE INTERNATIONAL CONFERENCE ON VERY LARGE SCALE INTEGRATION (VLSI-SOC), 2018, : 125 - 130
  • [50] A FPGA-HBM-based Hardware Streaming Accelerator for GNN Sampling
    Gui, Yuchen
    Wu, Qizhe
    Yuan, Wei
    Liang, Huawen
    Wang, Xiaotian
    Jin, Xi
    2024 IEEE 35TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS, ASAP 2024, 2024, : 77 - 78