Design Flow of Accelerating Hybrid Extremely Low Bit-width Neural Network in Embedded FPGA

Cited by: 53

Authors:

Wang, Junsong [1]

Lou, Qiuwen [2]

Zhang, Xiaofan [3]

Zhu, Chao [1]

Lin, Yonghua [1]

Chen, Deming [3]

Affiliations:

[1] IBM Res China, Beijing, Peoples R China

[2] Univ Notre Dame, Notre Dame, IN 46556 USA

[3] Univ Illinois, Champaign, IL USA

Source:

2018 28TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL) | 2018

Keywords:

DOI:

10.1109/FPL.2018.00035

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Neural network accelerators with low latency and low energy consumption are desirable for edge computing. To create such accelerators, we propose a design flow for accelerating the extremely low bit-width neural network (ELB-NN) in embedded FPGAs with hybrid quantization schemes. This flow covers both network training and FPGA-based network deployment, which facilitates design space exploration and simplifies the tradeoff between network accuracy and computation efficiency. Using this flow helps hardware designers deliver a network accelerator in edge devices under strict resource and power constraints. We present the proposed flow by supporting hybrid ELB settings within a neural network. Results show that our design can deliver very high performance, peaking at 10.3 TOPS, and classify up to 325.3 image/s/watt while running large-scale neural networks for less than 5W using an embedded FPGA. To the best of our knowledge, it is the most energy-efficient solution in comparison to GPU or other FPGA implementations reported so far in the literature.

Citation

Pages: 163-169

Page count: 7

