Toward Efficient Co-Design of CNN Quantization and HW Architecture on FPGA Hybrid-Accelerator

Authors
Zhang, Yiran [1 ]
Li, Guiying [1 ]
Yuan, Bo [1 ]
Affiliations
[1] Southern Univ Sci & Technol, Guangdong Prov Key Lab Brain Inspired Intelligent, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CNN accelerator; FPGA; DSE method
DOI
10.1109/SEDA62518.2024.10617620
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Field-programmable gate arrays (FPGAs) have emerged as a promising platform for accelerating convolutional neural networks (CNNs). In this paper, we propose a low-latency CNN hybrid-accelerator system and an efficient design space exploration (DSE) method. Specifically, our targeted FPGA platform combines different types of accelerators to gain two advantages: high concurrency and full hardware utilization (i.e., of both lookup tables (LUTs) and digital signal processors (DSPs)). In addition, we adopt a bandwidth-aware analytical model of system latency that accounts for pipeline stalls and computation cycles simultaneously. Furthermore, to handle the huge design space spanning layer-wise CNN quantization and the FPGA hybrid-accelerator architecture, we propose a DSE method named DiMEGA, a differentiable method embedded with a genetic algorithm, which aims to improve search efficiency. The performance of our CNN hybrid-accelerator system is demonstrated on a PYNQ-Z2 FPGA platform. The experimental results show that system latency can be reduced by 42% to 48% without sacrificing accuracy, and that the DSE time of DiMEGA is reduced by 23% on ResNet20-CIFAR10 and by 63% on ResNet56-CIFAR10, compared with the state of the art (SOTA).
Pages: 678-683
Page count: 6