Optimizing OpenCL Implementation of Deep Convolutional Neural Network on FPGA

Cited by: 2
Authors
Qiao, Yuran [1 ]
Shen, Junzhong [1 ]
Huang, Dafei [1 ]
Yang, Qianming [1 ]
Wen, Mei [1 ]
Zhang, Chunyuan [1 ]
Affiliation
[1] Natl Univ Def Technol, Coll Comp, Natl Key Lab Parallel & Distributed Proc, Changsha, Hunan, Peoples R China
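Source
Funding
Specialized Research Fund for the Doctoral Program of Higher Education; National High Technology Research and Development Program of China (863 Program);
Keywords
DOI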
10.1007/978-3-319-68210-5_9
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The rapid growth of data across the Internet has provided sufficient labeled data to train deep artificial neural networks. While deeper networks bring significant precision gains in many applications, they also demand higher computation capacity at the expense of power consumption. To this end, various FPGA-based deep neural network accelerators have been proposed for higher performance and lower energy consumption. However, the development cycle of an FPGA application is much longer than that of a CPU or GPU application. Although FPGA vendors such as Altera and Xilinx have released OpenCL frameworks to ease programming, tuning OpenCL code for desirable performance on FPGAs is still challenging. In this paper, we look into the OpenCL implementation of a Convolutional Neural Network (CNN) on FPGA. By analysing the execution behavior of a CPU/GPU-oriented version on FPGA, we identify the causes of the performance difference between FPGA and CPU/GPU and locate the performance bottlenecks. Based on this analysis, we put forward a corresponding optimization method focusing on external memory transfers. We implement a prototype system on an Altera Stratix V A7 FPGA, which achieves a 4.76x speedup over the original version. To the best of our knowledge, this implementation outperforms most previous OpenCL implementations on FPGA by a large margin.
Pages: 100 - 111
Page count: 12
Related papers
50 in total
  • [11] FPGA BASED IMPLEMENTATION OF CONVOLUTIONAL NEURAL NETWORK FOR HYPERSPECTRAL CLASSIFICATION
    Chen, Xiaofeng
    Ji, Jingyu
    Mei, Shaohui
    Zhang, Yifan
    Han, Manli
    Du, Qian
    [J]. IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, : 2451 - 2454
  • [12] FPGA Implementation of Convolutional Neural Network Based on Stochastic Computing
    Kim, Daewoo
    Moghaddam, Mansureh S.
    Moradian, Hossein
    Sim, Hyeonuk
    Lee, Jongeun
    Choi, Kiyoung
    [J]. 2017 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY (ICFPT), 2017, : 287 - 290
  • [13] FPGA-based Convolutional Neural Network Design and Implementation
    Yan, Ruitao
    Yi, Jianjun
    He, Jie
    Zhao, Yifan
    [J]. 2023 3RD ASIA-PACIFIC CONFERENCE ON COMMUNICATIONS TECHNOLOGY AND COMPUTER SCIENCE, ACCTCS, 2023, : 456 - 460
  • [14] Optimizing Convolutional Neural Network Accelerator on Low-Cost FPGA
    Truong Quang Vinh
    Dinh Viet Hai
    [J]. JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2021, 30 (11)
  • [15] A Fast Approach for Deep Neural Network Implementation on FPGA
    Nobari, Maedeh
    Jahanirad, Hadi
    [J]. 2021 29TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2021, : 89 - 93
  • [16] An FPGA-based Accelerator Implementation for Deep Convolutional Neural Networks
    Zhou, Yongmei
    Jiang, Jingfei
    [J]. PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015), 2015, : 829 - 832
  • [17] Optimizing Memory Efficiency for Deep Convolutional Neural Network Accelerators
    Li, Xiaowei
    Li, Jiajun
    Yan, Guihai
    [J]. JOURNAL OF LOW POWER ELECTRONICS, 2018, 14 (04) : 496 - 507
  • [18] Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks
    Ma, Yufei
    Cao, Yu
    Vrudhula, Sarma
    Seo, Jae-sun
    [J]. FPGA'17: PROCEEDINGS OF THE 2017 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, 2017, : 45 - 54
  • [19] Efficient FPGA Implementation of a Convolutional Neural Network for Radar Signal Processing
    Zhang, Jingchi
    Huang, Yihao
    Yang, Huanrui
    Martinez, Michael
    Hickman, Granger
    Krolik, Jeffrey
    Li, Hai
    [J]. 2021 IEEE 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS), 2021,
  • [20] FPGA Implementation for Odor Identification with Depthwise Separable Convolutional Neural Network
    Mo, Zhuofeng
    Luo, Dehan
    Wen, Tengteng
    Cheng, Yu
    Li, Xin
    [J]. SENSORS, 2021, 21 (03) : 1 - 19