Implementation and Optimization of the Accelerator Based on FPGA Hardware for LSTM Network

Cited by: 10

Authors:

Zhang, Yiwei [1]

Wang, Chao

Gong, Lei

Lu, Yuntao

Sun, Fan

Xu, Chongchong

Li, Xi

Zhou, Xuehai

Affiliation:

[1] USTC, Dept Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China

Source:

2017 15TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS AND 2017 16TH IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING AND COMMUNICATIONS (ISPA/IUCC 2017) | 2017

Funding:

National Science Foundation (United States);

Keywords:

DOI:

10.1109/ISPA/IUCC.2017.00098

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Today, artificial neural networks (ANNs) are important machine learning methods that are widely used in a variety of applications. As an emerging branch of ANNs, recurrent neural networks (RNNs) are often used for sequence-related applications, and Long Short-Term Memory (LSTM) is an improved RNN that contains complex computational logic. To achieve high accuracy, researchers often build large-scale LSTM networks, which are time-consuming and power-consuming. Accelerating LSTM networks while keeping power and energy consumption low has therefore become an active research topic. In this paper, we present a hardware accelerator for the LSTM neural network layer based on the FPGA ZedBoard and use pipelining to parallelize the forward computation. To optimize our implementation, we also use several methods, including tiled matrix-vector multiplication, a binary adder tree, and overlapping computation with data access. With these acceleration and optimization methods, our accelerator is power-efficient and outperforms the ARM Cortex-A9 processor and the Intel Core i5 processor.
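Two of the optimizations named in the abstract, tiled matrix-vector multiplication and a binary adder tree, can be illustrated in software. The following is only a minimal sketch under stated assumptions: the function names `adder_tree_sum` and `tiled_matvec`, the tile size, and the plain-Python list representation are hypothetical and are not taken from the paper, which implements these ideas in FPGA hardware.

```python
def adder_tree_sum(values):
    """Sum values by pairwise reduction, mimicking a binary adder tree.

    Each pass through the loop corresponds to one level of the tree, so
    n inputs are reduced in about log2(n) levels instead of n - 1
    sequential additions.
    """
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0.0)  # pad odd-sized levels with a zero input
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def tiled_matvec(matrix, vector, tile=4):
    """Compute matrix @ vector one column tile at a time.

    Only `tile` vector elements are needed per step, which stands in for
    the limited on-chip buffer that motivates tiling (an assumption here).
    """
    rows = len(matrix)
    cols = len(vector)
    result = [0.0] * rows
    for start in range(0, cols, tile):
        end = min(start + tile, cols)
        x_tile = vector[start:end]  # the slice "loaded" for this tile
        for r in range(rows):
            products = [matrix[r][start + k] * x_tile[k]
                        for k in range(end - start)]
            result[r] += adder_tree_sum(products)  # accumulate partial sums
    return result
```

In hardware, the multiplications within a tile and the additions within each tree level would run in parallel; the loops above only reproduce the data flow, not the timing.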


Pages: 614-621

Page count: 8
