A Threshold Neuron Pruning for a Binarized Deep Neural Network on an FPGA

Cited by: 14
Authors
Fujii, Tomoya [1 ]
Sato, Shimpei [1 ]
Nakahara, Hiroki [1 ]
Affiliations
[1] Tokyo Inst Technol, Dept Informat & Commun Engn, Tokyo 1528552, Japan
Keywords
machine learning; deep learning; pruning; FPGA;
DOI
10.1587/transinf.2017RCP0013
CLC Classification Number
TP [automation and computer technology]
Subject Classification Number
0812
Abstract
A pre-trained deep convolutional neural network (CNN) for an embedded system requires both high speed and low power consumption. The front part of a CNN consists of convolutional layers, while the back part consists of fully connected layers. In the convolutional layers, the multiply-accumulate operation is the bottleneck, whereas in the fully connected layers, memory access is the bottleneck. The binarized CNN has been proposed to realize many multiply-accumulate circuits on an FPGA, so the convolutional layers can be computed at high speed. However, even if binarization is applied to the fully connected layers, the amount of weight memory remains a bottleneck. In this paper, we propose a neuron pruning technique that eliminates most of the weight memory, and we apply it to the fully connected layers of a binarized CNN. In that case, since the weight memory can be realized by on-chip memory on the FPGA, high-speed memory access is achieved. To further reduce the memory size, we retrain the CNN after neuron pruning. We also propose a sequential-input parallel-output circuit for the binarized fully connected layer and a streaming circuit for the binarized 2D convolutional layer. The experimental results show that, for the fully connected layers of the VGG-11 CNN, neuron pruning reduces the number of neurons by 39.8% while retaining 99% of the baseline accuracy. We implemented the pruned CNN on the Xilinx Inc. Zynq ZedBoard. Compared with the ARM Cortex-A57, it was 1773.0 times faster, dissipated 3.1 times less power, and its performance per watt was 5781.3 times better. Compared with the Maxwell GPU, it was 11.1 times faster, dissipated 7.7 times less power, and its performance per watt was 84.1 times better. Thus, the binarized CNN on the FPGA is well suited to embedded systems.
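The neuron pruning idea in the abstract can be sketched in a few lines. The sketch below is an illustrative assumption, not the paper's exact threshold criterion: a hidden neuron is scored by the L1 norm of its (real-valued, pre-binarization) outgoing weights, and neurons scoring below a threshold are removed by deleting the matching row of the incoming weight matrix and column of the outgoing one, so no on-chip memory is spent on them.

```python
import numpy as np

def prune_neurons(w_in, w_out, threshold):
    """Threshold-prune hidden neurons of a fully connected layer pair.

    w_in:  weights into the hidden layer, shape (hidden, inputs), binarized to {-1, +1}
    w_out: weights out of the hidden layer, shape (outputs, hidden), real-valued
           before binarization (an assumption for scoring purposes)
    A neuron survives only if the L1 norm of its outgoing weights
    exceeds `threshold` (illustrative criterion, not the paper's).
    """
    scores = np.abs(w_out).sum(axis=0)   # one importance score per hidden neuron
    keep = scores > threshold            # boolean mask of surviving neurons
    # drop the pruned neurons' rows (incoming) and columns (outgoing)
    return w_in[keep, :], w_out[:, keep], keep

# toy example: 8 hidden neurons, 4 inputs, 3 outputs
rng = np.random.default_rng(0)
w_in = rng.choice([-1, 1], size=(8, 4))      # binarized incoming weights
w_out = rng.normal(size=(3, 8))              # real-valued outgoing weights
w_in_p, w_out_p, keep = prune_neurons(w_in, w_out, threshold=2.5)
print("kept", int(keep.sum()), "of", keep.size, "neurons")
```

After pruning, retraining the surviving weights (as the paper does) can recover most of the accuracy lost by removing neurons.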
Pages: 376 - 386 (11 pages)
Related Papers (50 total)
  • [1] An FPGA Realization of a Deep Convolutional Neural Network Using a Threshold Neuron Pruning
    Fujii, Tomoya
    Sato, Simpei
    Nakahara, Hiroki
    Motomura, Masato
    [J]. APPLIED RECONFIGURABLE COMPUTING, 2017, 10216 : 268 - 280
  • [2] A Batch Normalization Free Binarized Convolutional Deep Neural Network on an FPGA
    Nakahara, Hiroki
    Yonekawa, Haruyoshi
    Iwamoto, Hisashi
    Motomura, Masato
    [J]. FPGA'17: PROCEEDINGS OF THE 2017 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, 2017, : 290 - 290
  • [3] An FSCV Deep Neural Network: Development, Pruning, and Acceleration on an FPGA
    Zhang, Zhichao
    Oh, Yoonbae
    Adams, Scott D.
    Bennet, Kevin E.
    Kouzani, Abbas Z.
    [J]. IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2021, 25 (06) : 2248 - 2259
  • [4] FP-BNN: Binarized neural network on FPGA
    Liang, Shuang
    Yin, Shouyi
    Liu, Leibo
    Luk, Wayne
    Wei, Shaojun
    [J]. NEUROCOMPUTING, 2018, 275 : 1072 - 1086
  • [5] Binarized Depthwise Separable Neural Network for Object Tracking in FPGA
    Yang, Li
    He, Zhezhi
    Fan, Deliang
    [J]. GLSVLSI '19 - PROCEEDINGS OF THE 2019 ON GREAT LAKES SYMPOSIUM ON VLSI, 2019, : 347 - 350
  • [6] All Binarized Convolutional Neural Network and Its implementation on an FPGA
    Shimoda, Masayuki
    Sato, Shimpei
    Nakahara, Hiroki
    [J]. 2017 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY (ICFPT), 2017, : 291 - 294
  • [7] Implementing Binarized Neural Network Processor on FPGA-Based Platform
    Lee, Jeahack
    Kim, Hyeonseong
    Kim, Byung-Soo
    Jeon, Seokhun
    Lee, Jung Chul
    Kim, Dong Sun
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2022): INTELLIGENT TECHNOLOGY IN THE POST-PANDEMIC ERA, 2022, : 469 - 471
  • [8] FPGA based Implementation of Binarized Neural Network for Sign Language Application
    Jaiswal, Mohita
    Sharma, Vaidehi
    Sharma, Abhishek
    Saini, Sandeep
    Tomar, Raghuvir
    [J]. 2021 IEEE INTERNATIONAL SYMPOSIUM ON SMART ELECTRONIC SYSTEMS (ISES 2021), 2021, : 303 - 306
  • [9] A Fully Connected Layer Elimination for a Binarized Convolutional Neural Network on an FPGA
    Nakahara, Hiroki
    Fujii, Tomoya
    Sato, Shimpei
    [J]. 2017 27TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2017,
  • [10] Pruning by explaining: A novel criterion for deep neural network pruning
    Yeom, Seul-Ki
    Seegerer, Philipp
    Lapuschkin, Sebastian
    Binder, Alexander
    Wiedemann, Simon
    Mueller, Klaus-Robert
    Samek, Wojciech
    [J]. PATTERN RECOGNITION, 2021, 115