An FPGA-Based CNN Accelerator Integrating Depthwise Separable Convolution

Cited by: 66
Authors
Liu, Bing [1]
Zou, Danyin [1]
Feng, Lei [1]
Feng, Shou [1]
Fu, Ping [1]
Li, Junbao [1]
Affiliations
[1] Harbin Inst Technol, Sch Elect & Informat Engn, Harbin 150001, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
convolutional neural network (CNN); field programmable gate array (FPGA); depthwise separable convolution; accelerator; COPROCESSOR;
DOI
10.3390/electronics8030281
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Convolutional neural networks (CNNs) have been applied in many fields, such as image classification, face detection, and speech recognition, with remarkable results. Compared with GPU (graphics processing unit) and ASIC implementations, an FPGA (field programmable gate array)-based CNN accelerator offers low power consumption and reconfigurability. However, the FPGA's very limited resources and the CNN's large parameter count and computational complexity pose great challenges to the design. Built on the ZYNQ heterogeneous platform, with resource and bandwidth constraints balanced using the roofline model, the CNN accelerator we designed can accelerate both standard convolution and depthwise separable convolution with high hardware resource utilization. The accelerator handles network layers of different scales through parameter configuration, and it maximizes bandwidth and achieves full pipelining by using a data stream interface and a ping-pong on-chip cache. The experimental results show that the accelerator achieves 17.11 GOPS for 32-bit floating point while also supporting depthwise separable convolution, a clear advantage over other designs.
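The two ideas the abstract leans on, the lower cost of depthwise separable convolution and the roofline bound, can be illustrated with a short sketch. This is not the authors' code: the function names, the layer shape, and the peak and bandwidth figures are hypothetical values chosen only for illustration. It assumes the usual definition of depthwise separable convolution (a per-channel k x k depthwise step followed by a 1 x 1 pointwise step) and the usual roofline formula (attainable performance is the smaller of peak compute and bandwidth times operational intensity).

```python
def standard_conv_macs(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulates for a standard k x k convolution layer."""
    return h_out * w_out * c_in * c_out * k * k


def depthwise_separable_macs(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulates for a depthwise k x k step plus a 1 x 1 pointwise step."""
    depthwise = h_out * w_out * c_in * k * k   # one k x k filter per input channel
    pointwise = h_out * w_out * c_in * c_out   # 1 x 1 convolution mixes channels
    return depthwise + pointwise


def roofline_attainable(peak_gops, bandwidth_gbs, ops_per_byte):
    """Roofline bound: performance is capped by compute or by memory traffic."""
    return min(peak_gops, bandwidth_gbs * ops_per_byte)


if __name__ == "__main__":
    # Hypothetical layer shape, not taken from the paper.
    shape = dict(h_out=112, w_out=112, c_in=32, c_out=64, k=3)
    std = standard_conv_macs(**shape)
    dws = depthwise_separable_macs(**shape)
    print(f"standard MACs:            {std}")
    print(f"depthwise separable MACs: {dws}")
    print(f"cost ratio:               {dws / std:.4f}")  # equals 1/c_out + 1/k^2

    # Hypothetical platform figures, not measurements from the paper.
    print(roofline_attainable(peak_gops=20.0, bandwidth_gbs=4.0, ops_per_byte=2.0))
    print(roofline_attainable(peak_gops=20.0, bandwidth_gbs=4.0, ops_per_byte=10.0))
```

The cost ratio reduces to 1/c_out + 1/k^2, which is why the saving grows with the number of output channels. The two roofline calls show a bandwidth-bound case and a compute-bound case respectively.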
Pages: 18
Related papers
50 in total
  • [1] A CNN Accelerator on FPGA Using Depthwise Separable Convolution
    Bai, Lin
    Zhao, Yiming
    Huang, Xinming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2018, 65 (10) : 1415 - 1419
  • [2] A High-Performance FPGA-Based Depthwise Separable Convolution Accelerator
    Huang, Jiye
    Liu, Xin
    Guo, Tongdong
    Zhao, Zhijin
    ELECTRONICS, 2023, 12 (07)
  • [3] A Depthwise Separable Convolution Architecture for CNN Accelerator
    Srivastava, Harsh
    Sarawadekar, Kishor
    PROCEEDINGS OF 2020 IEEE APPLIED SIGNAL PROCESSING CONFERENCE (ASPCON 2020), 2020, : 1 - 5
  • [4] An FPGA-Based Energy-Efficient Reconfigurable Depthwise Separable Convolution Accelerator for Image Recognition
    Xuan, Lei
    Un, Ka-Fai
    Lam, Chi-Seng
    Martins, Rui P.
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (10) : 4003 - 4007
  • [5] An FPGA-Based Approach for Compressing and Accelerating Depthwise Separable Convolution
    Yang, Ruiheng
    Chen, Zhikun
    Hu, Lingtong
    Cui, Xihang
    Guo, Yunfei
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2590 - 2594
  • [6] An Efficient FPGA-based Depthwise Separable Convolutional Neural Network Accelerator with Hardware Pruning
    Liu, Zhengyan
    Liu, Qiang
    Yan, Shun
    Cheung, Ray C. C.
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2024, 17 (01)
  • [7] The Data Flow and Architectural Optimizations for a Highly Efficient CNN Accelerator Based on the Depthwise Separable Convolution
    Lin, Hung-Ju
    Shen, Chung-An
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, 41 (06) : 3547 - 3569
  • [8] The Data Flow and Architectural Optimizations for a Highly Efficient CNN Accelerator Based on the Depthwise Separable Convolution
    Hung-Ju Lin
    Chung-An Shen
    Circuits, Systems, and Signal Processing, 2022, 41 : 3547 - 3569
  • [9] MLogNet: A Logarithmic Quantization-Based Accelerator for Depthwise Separable Convolution
    Choi, Jooyeon
    Sim, Hyeonuk
    Oh, Sangyun
    Lee, Sugil
    Lee, Jongeun
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (12) : 5220 - 5231
  • [10] An Efficient FPGA-Based Accelerator Design for Convolution
    Song, Peng-Fei
    Pan, Jeng-Shyang
    Yang, Chun-Sheng
    Lee, Chiou-Yng
    2017 IEEE 8TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST), 2017, : 494 - 500