An Uninterrupted Processing Technique-Based High-Throughput and Energy-Efficient Hardware Accelerator for Convolutional Neural Networks

Cited by: 5
Authors
Islam, Md Najrul [1 ]
Shrestha, Rahul [1 ]
Chowdhury, Shubhajit Roy [1 ]
Affiliations
[1] Indian Inst Technol IIT Mandi, Sch Comp & Elect Engn, Mandi 175075, Himachal Pradesh, India
Keywords
Convolutional neural network (CNN); digital VLSI-architecture design; field-programmable gate array (FPGA); VGG-16 and GoogLeNet neural networks; VLSI; CNN;
DOI
10.1109/TVLSI.2022.3210963
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This article proposes an uninterrupted processing technique for convolutional neural network (CNN) accelerators. It allows the accelerator to perform processing element (PE) operations and data fetching simultaneously, which reduces latency and raises the achievable throughput. Building on this technique, the work presents a low-latency VLSI architecture of the CNN accelerator that uses a new random access line-buffer (RALB)-based design of the PE array. The architecture is further optimized by reusing local data in the PE array, which improves energy conservation. The accelerator has been implemented on a Zynq UltraScale+ MPSoC ZCU102 FPGA board, where it operates at a maximum clock frequency of 340 MHz and consumes 4.11 W of total power. With 864 PEs, it delivers a peak throughput of 587.52 GOPs and an energy efficiency of 142.95 GOPs/W. Compared with the literature, these implementation results show 33.42% higher throughput and 6.24x better energy efficiency than the state-of-the-art work. Finally, the field-programmable gate array (FPGA) prototype has been functionally validated in a real-world test setup for object detection on input images using the GoogLeNet neural network.
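The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is not taken from the paper: it assumes each PE completes one multiply-accumulate per clock cycle, counted as two operations (a common convention), and the function names are hypothetical. Under that assumption, the abstract's PE count, clock frequency, and power reproduce the quoted throughput and energy efficiency.

```python
# Sanity check of the abstract's figures.
# ASSUMPTION (not stated in the abstract): each PE performs one
# multiply-accumulate per cycle, counted as 2 operations.

def peak_throughput_gops(num_pes, clock_mhz, ops_per_pe_per_cycle=2):
    """Peak throughput in giga-operations per second."""
    # PEs * MHz * ops gives mega-operations per second; divide by 1000 for giga.
    return num_pes * clock_mhz * ops_per_pe_per_cycle / 1000.0

def energy_efficiency_gops_per_w(throughput_gops, power_w):
    """Energy efficiency in giga-operations per second per watt."""
    return throughput_gops / power_w

# Values quoted in the abstract: 864 PEs, 340 MHz, 4.11 W.
peak = peak_throughput_gops(864, 340)
eff = energy_efficiency_gops_per_w(peak, 4.11)
print(f"{peak:.2f} GOPs, {eff:.2f} GOPs/W")  # prints "587.52 GOPs, 142.95 GOPs/W"
```

That both numbers match suggests the reported throughput is a peak figure derived from the PE count and clock rate rather than a measured per-network value, though the abstract alone does not confirm this.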
Pages: 1891 - 1901
Page count: 11
Related papers
50 in total
  • [31] High-Throughput Multichannel Parallelized Diffraction Convolutional Neural Network Accelerator
    Hu, Zibo
    Li, Shurui
    Schwartz, Russell L. T.
    Solyanik-Gorgone, Maria
    Miscuglio, Mario
    Gupta, Puneet
    Sorger, Volker J.
    LASER & PHOTONICS REVIEWS, 2022, 16 (12)
  • [32] SYNTHNET: A High-throughput yet Energy-efficient Combinational Logic Neural Network
    Chen, Tianen
    Kemp, Taylor
    Kim, Younghyun
    27TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2022, 2022, : 232 - 237
  • [33] An energy-efficient convolutional neural network accelerator for speech classification based on FPGA and quantization
    Wen, Dong
    Jiang, Jingfei
    Dou, Yong
    Xu, Jinwei
    Xiao, Tao
    CCF TRANSACTIONS ON HIGH PERFORMANCE COMPUTING, 2021, 3 (01) : 4 - 16
  • [34] An energy-efficient convolutional neural network accelerator for speech classification based on FPGA and quantization
    Dong Wen
    Jingfei Jiang
    Yong Dou
    Jinwei Xu
    Tao Xiao
    CCF Transactions on High Performance Computing, 2021, 3 : 4 - 16
  • [35] An Efficient FIFO Based Accelerator for Convolutional Neural Networks
    Panchbhaiyye, Vineet
    Ogunfunmi, Tokunbo
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2021, 93 (10) : 1117 - 1129
  • [36] An Efficient FIFO Based Accelerator for Convolutional Neural Networks
    Vineet Panchbhaiyye
    Tokunbo Ogunfunmi
    Journal of Signal Processing Systems, 2021, 93 : 1117 - 1129
  • [37] Quantization and sparsity-aware processing for energy-efficient NVM-based convolutional neural networks
    Bao, Han
    Qin, Yifan
    Chen, Jia
    Yang, Ling
    Li, Jiancong
    Zhou, Houji
    Li, Yi
    Miao, Xiangshui
    FRONTIERS IN ELECTRONICS, 2022, 3
  • [38] A Precision-Scalable Energy-Efficient Convolutional Neural Network Accelerator
    Liu, Wenjian
    Lin, Jun
    Wang, Zhongfeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2020, 67 (10) : 3484 - 3497
  • [39] Energy-Efficient and High-Throughput Nanophotonic Neuromorphic Computing
    Nazirzadeh, Mohammadamin
    Shamsabardeh, Mohammadsadegh
    Ben Yoo, S. J.
    2018 CONFERENCE ON LASERS AND ELECTRO-OPTICS (CLEO), 2018
  • [40] SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Networks
    Aminabadi, Reza Yazdani
    Ruwase, Olatunji
    Zhang, Minjia
    He, Yuxiong
    Arnau, Jose-Maria
    Gonzalez, Antonio
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2023, 22 (02)