Resource and Data Optimization for Hardware Implementation of Deep Neural Networks Targeting FPGA-based Edge Devices

Cited by: 2

Authors:

Liu, Xinheng ^{[1,2]}

Kim, Dae Hee ^{[1]}

Wu, Chang ^{[3]}

Chen, Deming ^{[1,2]}

Affiliations:

[1] Univ Illinois, Urbana, IL 61801 USA

[2] Inspirit IoT Inc, Champaign, IL 61822 USA

[3] Fudan Univ, Shanghai, Peoples R China

Source:

2018 ACM/IEEE INTERNATIONAL WORKSHOP ON SYSTEM LEVEL INTERCONNECT PREDICTION (SLIP) | 2018

Keywords:

FPGA; Convolutional Neural Network; Optimization; Acceleration; High-Level Synthesis;

DOI:

10.1145/3225209.3225214

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Recently, as machine learning algorithms have become more practical, there has been much effort to implement them on edge devices that can be used in our daily lives. However, unlike server-scale devices, edge devices are relatively small and thus have much more limited resources. Therefore, control of resource usage and hardware optimization play an important role when we implement machine learning algorithms on an edge device. In this paper, we target convolutional neural networks (CNNs) and explore various optimization and design techniques to realize them on FPGA devices. The key idea explored in this paper is Backward Pipeline Scheduling together with Latency Balancing, which optimize the pipeline between CNN layers in order to significantly reduce the overall latency for processing a single image. We also develop a batch processing design to improve the throughput of the FPGA solution. We have achieved a latency of 175.7 μs for classifying one image in the MNIST data set using LeNet and 653.4 μs for classifying one image in the CIFAR-10 data set using CifarNet. Without retraining, we are still able to maintain high accuracy of 97.6% for the MNIST data set and 83.6% for the CIFAR-10 data set. Our achieved single-image latency is 5.2x faster for LeNet and 1.95x faster for CifarNet compared to the NVIDIA Jetson TX1 solution.

Pages: 8

