An Uninterrupted Processing Technique-Based High-Throughput and Energy-Efficient Hardware Accelerator for Convolutional Neural Networks

Cited by: 5
Authors
Islam, Md Najrul [1]
Shrestha, Rahul [1]
Chowdhury, Shubhajit Roy [1]
Affiliation
[1] Indian Inst Technol IIT Mandi, Sch Comp & Elect Engn, Mandi 175075, Himachal Pradesh, India
Keywords
Convolutional neural network (CNN); digital VLSI-architecture design; field-programmable gate array (FPGA); VGG-16 and GoogLeNet neural networks; VLSI; CNN
DOI
10.1109/TVLSI.2022.3210963
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
This article proposes an uninterrupted processing technique for the convolutional neural network (CNN) accelerator. It allows the CNN accelerator to perform processing-element (PE) operations and data fetching simultaneously, which reduces latency and enhances the achievable throughput. Based on the suggested technique, this work also presents a low-latency VLSI architecture of the CNN accelerator using a new random-access line-buffer (RALB)-based design of the PE array. Subsequently, the proposed CNN-accelerator architecture has been further optimized by reusing local data in the PE array, yielding better energy conservation. Our CNN accelerator has been implemented in hardware on the Zynq UltraScale+ MPSoC ZCU102 FPGA board; it operates at a maximum clock frequency of 340 MHz and consumes 4.11 W of total power. In addition, the suggested CNN accelerator with 864 PEs delivers a peak throughput of 587.52 GOPs and an energy efficiency of 142.95 GOPs/W. Comparison of these implementation results with the literature shows that our CNN accelerator delivers 33.42% higher throughput and 6.24x better energy efficiency than the state-of-the-art work. Finally, the field-programmable gate array (FPGA) prototype of the proposed CNN accelerator has been functionally validated on a real-world test setup for object detection from input images using the GoogLeNet neural network.
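As a quick sanity check of the figures quoted in the abstract, the short Python sketch below reproduces the reported peak-throughput and energy-efficiency numbers. It assumes the common convention that each PE completes one multiply-accumulate (two operations) per clock cycle at full utilization; the constant names are illustrative and are not taken from the paper.

# Back-of-the-envelope check of the reported peak throughput and energy
# efficiency. Assumes each PE completes one multiply-accumulate
# (2 operations) per clock cycle at peak utilization.
NUM_PES = 864              # number of PEs (from the abstract)
OPS_PER_PE_PER_CYCLE = 2   # 1 MAC = 1 multiply + 1 add (assumed convention)
CLOCK_HZ = 340e6           # maximum clock frequency, 340 MHz
TOTAL_POWER_W = 4.11       # reported total power of the FPGA implementation

peak_gops = NUM_PES * OPS_PER_PE_PER_CYCLE * CLOCK_HZ / 1e9
energy_efficiency = peak_gops / TOTAL_POWER_W

print(f"Peak throughput  : {peak_gops:.2f} GOPs")            # 587.52 GOPs
print(f"Energy efficiency: {energy_efficiency:.2f} GOPs/W")   # 142.95 GOPs/W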
Pages: 1891-1901
Number of pages: 11