An Uninterrupted Processing Technique-Based High-Throughput and Energy-Efficient Hardware Accelerator for Convolutional Neural Networks

Cited by: 5
Authors
Islam, Md Najrul [1]
Shrestha, Rahul [1]
Chowdhury, Shubhajit Roy [1]
Affiliations
[1] Indian Inst Technol IIT Mandi, Sch Comp & Elect Engn, Mandi 175075, Himachal Pradesh, India
Keywords
Convolutional neural network (CNN); digital VLSI-architecture design; field-programmable gate array (FPGA); VGG-16 and GoogLeNet neural networks; VLSI; CNN
DOI
10.1109/TVLSI.2022.3210963
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This article proposes an uninterrupted processing technique for convolutional neural network (CNN) accelerators. The technique lets the accelerator perform processing element (PE) operations and data fetching simultaneously, which reduces latency and raises the achievable throughput. Based on this technique, the work presents a low-latency VLSI architecture for the CNN accelerator that uses a new random access line-buffer (RALB)-based PE array design. The architecture is further optimized by reusing local data in the PE array, which improves energy conservation. The accelerator has been implemented on a Zynq UltraScale+ MPSoC ZCU102 field-programmable gate array (FPGA) board, where it operates at a maximum clock frequency of 340 MHz and consumes 4.11 W of total power. With 864 PEs, it delivers a peak throughput of 587.52 GOPs and an energy efficiency of 142.95 GOPs/W. Compared with the state-of-the-art work in the literature, it delivers 33.42% higher throughput and 6.24x better energy efficiency. Finally, the FPGA prototype has been functionally validated in a real-world test setup that detects objects in an input image using the GoogLeNet neural network.
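The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below assumes, as is conventional but not stated in the abstract, that each PE completes one multiply-accumulate (MAC) per clock cycle and that a MAC counts as two operations; the function names are illustrative only.

```python
# Assumption (not stated in the abstract): each PE completes one
# multiply-accumulate (MAC) per cycle, counted as 2 operations.
OPS_PER_MAC = 2


def peak_throughput_gops(num_pes: int, clock_mhz: float,
                         ops_per_pe_per_cycle: int = OPS_PER_MAC) -> float:
    """Peak throughput in giga-operations per second."""
    ops_per_second = num_pes * ops_per_pe_per_cycle * clock_mhz * 1e6
    return ops_per_second / 1e9


def energy_efficiency_gops_per_watt(throughput_gops: float,
                                    power_w: float) -> float:
    """Energy efficiency as throughput divided by total power."""
    return throughput_gops / power_w


# Values taken from the abstract: 864 PEs, 340 MHz, 4.11 W.
throughput = peak_throughput_gops(num_pes=864, clock_mhz=340.0)
efficiency = energy_efficiency_gops_per_watt(throughput, power_w=4.11)
print(f"{throughput:.2f} GOPs, {efficiency:.2f} GOPs/W")
```

Under that assumption, the computed values agree with the reported 587.52 GOPs and 142.95 GOPs/W, which suggests the peak throughput is the theoretical maximum with every PE busy in every cycle.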
Pages: 1891-1901
Page count: 11
Related papers
50 in total
  • [1] Energy-Efficient and High-Throughput FPGA-based Accelerator for Convolutional Neural Networks
    Feng, Gan
    Hu, Zuyi
    Chen, Song
    Wu, Feng
    2016 13TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2016, : 624 - 626
  • [2] EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks
    Liang, Shengwen
    Wang, Ying
    Liu, Cheng
    He, Lei
    Li, Huawei
    Xu, Dawen
    Li, Xiaowei
    IEEE TRANSACTIONS ON COMPUTERS, 2021, 70 (09) : 1511 - 1525
  • [3] AN ENERGY-EFFICIENT MEMORY-BASED HIGH-THROUGHPUT VLSI ARCHITECTURE FOR CONVOLUTIONAL NETWORKS
    Kang, Mingu
    Gonugondla, Sujan K.
    Keel, Min-Sun
    Shanbhag, Naresh R.
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 1037 - 1041
  • [4] FireFly: A High-Throughput Hardware Accelerator for Spiking Neural Networks With Efficient DSP and Memory Optimization
    Li, Jindong
    Shen, Guobin
    Zhao, Dongcheng
    Zhang, Qian
    Zeng, Yi
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2023, 31 (08) : 1178 - 1191
  • [5] High-throughput, energy-efficient network-on-chip-based hardware accelerators
    Majumder, Turbo
    Pande, Partha Pratim
    Kalyanaraman, Ananth
    SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS, 2013, 3 (01): : 36 - 46
  • [6] Hardware Design of an Energy-Efficient High-Throughput Median Filter
    Lin, Shih-Hsiang
    Chen, Pei-Yin
    Lin, Chang-Hsing
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2018, 65 (11) : 1728 - 1732
  • [7] High-throughput and Energy-efficient Graph Processing on FPGA
    Zhou, Shijie
    Chelmis, Charalampos
    Prasanna, Viktor K.
    2016 IEEE 24TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2016, : 103 - 110
  • [8] PNeuro: a scalable energy-efficient programmable hardware accelerator for neural networks
    Carbon, A.
    Philippe, J-M.
    Bichler, O.
    Schmit, R.
    Tain, B.
    Briand, D.
    Ventroux, N.
    Paindavoine, M.
    Brousse, O.
    PROCEEDINGS OF THE 2018 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2018, : 1039 - 1044
  • [9] Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks
    Chen, Yu-Hsin
    Krishna, Tushar
    Emer, Joel
    Sze, Vivienne
    2016 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE (ISSCC), 2016, 59 : 262 - U363
  • [10] Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks
    Chen, Yu-Hsin
    Krishna, Tushar
    Emer, Joel S.
    Sze, Vivienne
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2017, 52 (01) : 127 - 138