An FPGA-Based CNN Accelerator Integrating Depthwise Separable Convolution

Cited by: 66
Authors
Liu, Bing [1]
Zou, Danyin [1]
Feng, Lei [1]
Feng, Shou [1]
Fu, Ping [1]
Li, Junbao [1]
Affiliations
[1] Harbin Inst Technol, Sch Elect & Informat Engn, Harbin 150001, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
convolutional neural network (CNN); field programmable gate array (FPGA); depthwise separable convolution; accelerator; COPROCESSOR;
DOI
10.3390/electronics8030281
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The Convolutional Neural Network (CNN) has achieved remarkable results in many fields, such as image classification, face detection, and speech recognition. Compared to GPU (graphics processing unit) and ASIC implementations, an FPGA (field programmable gate array)-based CNN accelerator has great advantages due to its low power consumption and reconfigurability. However, the FPGA's extremely limited resources, together with the CNN's large number of parameters and high computational complexity, pose great challenges to the design. Built on the ZYNQ heterogeneous platform, and using the roofline model to balance resource and bandwidth constraints, the CNN accelerator we designed can accelerate both standard convolution and depthwise separable convolution with high hardware resource utilization. The accelerator handles network layers of different scales through parameter configuration, and it maximizes bandwidth and achieves full pipelining by using a data stream interface and a ping-pong on-chip cache. The experimental results show that the accelerator designed in this paper achieves 17.11 GOPS for 32-bit floating point while also supporting depthwise separable convolution, which is an obvious advantage compared with other designs.
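The two ideas the abstract leans on, the reduced cost of depthwise separable convolution and the roofline bound, can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the layer dimensions, and the peak and bandwidth figures are hypothetical, and only the standard textbook formulas are assumed.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a standard k x k convolution
    producing an h x w output with c_out channels."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a depthwise k x k convolution
    followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes channels
    return depthwise + pointwise


def roofline_attainable(peak_gops, bandwidth_gbytes_per_s, ops_per_byte):
    """Roofline bound: performance is capped either by peak compute
    or by memory bandwidth times operational intensity."""
    return min(peak_gops, bandwidth_gbytes_per_s * ops_per_byte)


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    h, w, c_in, c_out, k = 56, 56, 128, 128, 3
    std = standard_conv_macs(h, w, c_in, c_out, k)
    dws = depthwise_separable_macs(h, w, c_in, c_out, k)
    # The ratio equals 1/c_out + 1/k**2 for any layer shape.
    print(f"separable / standard MACs: {dws / std:.4f}")
    # Hypothetical platform numbers, not measurements from the paper.
    print(roofline_attainable(peak_gops=100.0,
                              bandwidth_gbytes_per_s=4.0,
                              ops_per_byte=10.0))
```

Because a separable layer performs far fewer operations on comparable data volumes, its operational intensity tends to be lower, which is one reason a roofline analysis is useful when a single design has to serve both layer types.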
Pages: 18
Related papers
50 in total
  • [41] Flare: An FPGA-Based Full Precision Low Power CNN Accelerator with Reconfigurable Structure
    Xu, Yuhua
    Luo, Jie
    Sun, Wei
    SENSORS, 2024, 24 (07)
  • [42] Exploration of Memory Access Optimization for FPGA-based 3D CNN Accelerator
    Tian, Teng
    Jin, Xi
    Zhao, Letian
    Wang, Xiaotian
    Wang, Jie
    Wu, Wei
    PROCEEDINGS OF THE 2020 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2020), 2020, : 1650 - 1655
  • [43] An Efficient Method for DPM Code Localization Based on Depthwise Separable Convolution
    Li, Yusheng
    Tian, Yong
    Tian, Jindong
    Zhou, Fei
    IEEE ACCESS, 2019, 7 : 42014 - 42023
  • [44] MDWConv:CNN based on multi-scale atrous pyramid and depthwise separable convolution for long time series forecasting
    Tian, Guangpo
    Xu, Yunyang
    Ma, Xiang
    Li, Xuemei
    Zhang, Caiming
    NEURAL NETWORKS, 2025, 185
  • [45] Recognition of Crop Diseases Based on Depthwise Separable Convolution in Edge Computing
    Gu, Musong
    Li, Kuan-Ching
    Li, Zhongwen
    Han, Qiyi
    Fan, Wenjie
    SENSORS, 2020, 20 (15) : 1 - 16
  • [46] Traffic flow prediction based on depthwise separable convolution fusion network
    Yu, Yue
    Sun, Wei
    Liu, Jianhua
    Zhang, Changfan
    JOURNAL OF BIG DATA, 2022, 9 (01)
  • [47] An improved architecture for urban building extraction based on depthwise separable convolution
    Zhang, Xiaoqing
    Zheng, Yongguo
    Liu, Weike
    Peng, Yanjun
    Wang, Zhiyong
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (05) : 5821 - 5829
  • [48] DIMA: A Depthwise CNN In-Memory Accelerator
    Angizi, Shaahin
    He, Zhezhi
    Fan, Deliang
    2018 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD) DIGEST OF TECHNICAL PAPERS, 2018,
  • [49] Lightweight Residual Network Based on Depthwise Separable Convolution for Hyperspectral Image Classification
    Cheng Rongjie
    Yang Yun
    Li Longwei
    Wang Yanting
    Wang Jiayu
    ACTA OPTICA SINICA, 2023, 43 (12)
  • [50] Fire Detection Method Based on Depthwise Separable Convolution and YOLOv3
    Yue-Yan Qin
    Jiang-Tao Cao
    Xiao-Fei Ji
    International Journal of Automation and Computing, 2021, 18 : 300 - 310