A Parallelism Strategy Optimization Search Algorithm Based on Three-dimensional Deformable CNN Acceleration Architecture

Cited by: 0
Authors
Qu Xinyuan
Xu Yu
Huang Zhihong [1 ]
Cai Gang
Fang Zhen
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Field Programmable Gate Array (FPGA); Convolutional Neural Network (CNN); Hardware acceleration;
DOI
10.11999/JEIT210059
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
Field Programmable Gate Arrays (FPGAs) are widely used for Convolutional Neural Network (CNN) hardware acceleration. To improve performance, Qu et al. (2021) proposed a three-dimensional deformable CNN acceleration architecture. However, this architecture causes an explosive growth of the parallelism strategy exploration space, so the time needed to search for the optimal parallelism surges, which severely reduces the feasibility of implementing the accelerator. To solve this problem, a fine-grained iterative-optimization parallelism search algorithm is proposed in this paper. The algorithm applies multiple rounds of iterative data filtering to eliminate redundant parallelism schemes efficiently, compressing the search space by more than 99%. It also prunes invalid computation branches, reducing the search time from roughly 10^6 h to less than 10 s. The algorithm performs well on different kinds of FPGAs, achieving an average computing resource utilization (R1, R2) of up to (0.957, 0.962).
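Since only the abstract is available here, the short Python sketch below is merely an illustration of the filter-and-prune style of search the abstract describes: the parallelism tuple (pm, pn, pr), the DSP budget, the toy utilization model, and the per-round keep ratio are all hypothetical assumptions introduced for illustration, not the authors' actual algorithm, cost model, or parallelism dimensions.

# Minimal sketch, assuming a 3-D parallelism tuple (pm, pn, pr) over output
# channels, input channels and feature-map rows, a DSP budget as the only
# resource constraint, and a toy padding-based utilization model. None of
# these assumptions come from the paper; they only illustrate the
# "filter in rounds, prune invalid branches" idea stated in the abstract.
from itertools import product

# Hypothetical layer shapes: (out_channels, in_channels, rows) per layer.
LAYERS = [(64, 3, 224), (128, 64, 112), (256, 128, 56), (512, 256, 28)]


def ceil_div(a, b):
    return -(-a // b)


def resource_cost(pm, pn, pr):
    # Assumed cost model: one DSP per multiply-accumulate unit.
    return pm * pn * pr


def utilization(layer, pm, pn, pr):
    # Fraction of PE cycles doing useful work once each dimension is padded
    # up to a multiple of its parallelism factor (toy model).
    m, n, r = layer
    ideal = m * n * r
    padded = ceil_div(m, pm) * pm * ceil_div(n, pn) * pn * ceil_div(r, pr) * pr
    return ideal / padded


def search(layers, dsp_budget=1024, max_factor=64, keep_ratio=0.1):
    # Pruning: discard any (pm, pn, pr) branch that already violates the
    # resource budget, so it is never evaluated against the layers.
    candidates = [p for p in product(range(1, max_factor + 1), repeat=3)
                  if resource_cost(*p) <= dsp_budget]

    # Iterative filtering: after scoring the survivors against one layer,
    # keep only the best fraction before moving to the next round, so the
    # exploration space shrinks round by round.
    for layer in layers:
        candidates.sort(key=lambda p: utilization(layer, *p), reverse=True)
        candidates = candidates[:max(1, int(len(candidates) * keep_ratio))]

    # Pick the survivor with the best average utilization over all layers.
    return max(candidates,
               key=lambda p: sum(utilization(l, *p) for l in layers))


if __name__ == "__main__":
    pm, pn, pr = search(LAYERS)
    print("chosen parallelism:", (pm, pn, pr))

With keep_ratio = 0.1, each filtering round discards roughly 90% of the surviving candidates, which mirrors in spirit (though not in detail) the over-99% search-space compression reported in the abstract; the budget check before scoring stands in for the pruning of invalid computation branches.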
Pages: 1503-1512
Number of pages: 10
Related Papers
15 items in total
  • [1] Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA
    Guo, Kaiyuan
    Sui, Lingzhi
    Qiu, Jiantao
    Yu, Jincheng
    Wang, Junbin
    Yao, Song
    Han, Song
    Wang, Yu
    Yang, Huazhong
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2018, 37 (01) : 35 - 47
  • [2] Krizhevsky A., 2012, COMMUN ACM, V60, P84, DOI 10.1145/3065386
  • [3] Gradient-based learning applied to document recognition
    Lecun, Y
    Bottou, L
    Bengio, Y
    Haffner, P
    [J]. PROCEEDINGS OF THE IEEE, 1998, 86 (11) : 2278 - 2324
  • [4] A High Performance FPGA-based Accelerator for Large-Scale Convolutional Neural Networks
    Li, Huimin
    Fan, Xitian
    Jiao, Li
    Cao, Wei
    Zhou, Xuegong
    Wang, Lingli
    [J]. 2016 26TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2016,
  • [5] Liu ZQ, 2016, 2016 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT), P61, DOI 10.1109/FPT.2016.7929190
  • [6] A Uniform Architecture Design for Accelerating 2D and 3D CNNs on FPGAs
    Liu, Zhiqiang
    Chow, Paul
    Xu, Jinwei
    Jiang, Jingfei
    Dou, Yong
    Zhou, Jie
    [J]. ELECTRONICS, 2019, 8 (01)
  • [7] Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks
    Liu, Zhiqiang
    Dou, Yong
    Jiang, Jingfei
    Xu, Jinwei
    Li, Shijie
    Zhou, Yongmei
    Xu, Yingnan
    [J]. ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2017, 10 (03)
  • [8] Optimizing the Convolution Operation to Accelerate Deep Neural Networks on FPGA
    Ma, Yufei
    Cao, Yu
    Vrudhula, Sarma
    Seo, Jae-sun
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2018, 26 (07) : 1354 - 1367
  • [9] Going Deeper with Embedded FPGA Platform for Convolutional Neural Network
    Qiu, Jiantao
    Wang, Jie
    Yao, Song
    Guo, Kaiyuan
    Li, Boxun
    Zhou, Erjin
    Yu, Jincheng
    Tang, Tianqi
    Xu, Ningyi
    Song, Sen
    Wang, Yu
    Yang, Huazhong
    [J]. PROCEEDINGS OF THE 2016 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS (FPGA'16), 2016, : 26 - 35
  • [10] Cheetah: An Accurate Assessment Mechanism and a High-Throughput Acceleration Architecture Oriented Toward Resource Efficiency
    Qu, Xinyuan
    Huang, Zhihong
    Xu, Yu
    Mao, Ning
    Cai, Gang
    Fang, Zhen
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2021, 40 (05) : 878 - 891