Energy-Efficient CNNs Accelerator Implementation on FPGA with Optimized Storage and Dataflow

Cited by: 0
Authors
Zhang, Yonghua [1]
Jiang, Hongxu [1]
Li, Xiaobin [1]
Miao, Rui [1]
Nie, Jinyan [2]
Du, Yu [3]
Affiliations
[1] Beihang Univ, Hangzhou Innovat Inst, Beijing Key Lab Digital Media, Beijing, Peoples R China
[2] Beihang Univ, Beijing, Peoples R China
[3] Beijing Union Univ, Beijing, Peoples R China
Keywords
CNNs Accelerator; FPGA; Energy-Efficient; Storage; Dataflow; Deep Neural Networks
DOI
10.1109/ISPA-BDCloud-SocialCom-SustainCom52081.2021.00166
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Convolutional neural networks (CNNs) have been widely used in computer vision and speech processing and have achieved great success. However, the deployment of large-scale CNN models on smart embedded systems is limited by computing and memory resources. The highly parallel computing paradigm of current CNN accelerators can effectively meet the computing requirements and achieve high throughput. On small smart platforms, however, the communication cost may exceed the computing cost, so energy consumption remains very high. To solve this problem, a new CNN accelerator based on optimized storage and dataflow is proposed, which achieves energy-efficient CNN inference acceleration by minimizing data access and maximizing data reuse. This paper implements the accelerator on the Zynq UltraScale+ MPSoC ZCU102 evaluation board and evaluates its throughput and energy efficiency on the typical VGG16 and Tiny YOLO benchmark networks. Compared with other accelerators, our accelerator improves system energy efficiency by 6.3x, system throughput by 41x, and per-DSP throughput by 7.63x.
Pages: 1209-1214
Page count: 6