Toward Efficient Co-Design of CNN Quantization and HW Architecture on FPGA Hybrid-Accelerator

Authors
Zhang, Yiran [1 ]
Li, Guiying [1 ]
Yuan, Bo [1 ]
Affiliations
[1] Southern Univ Sci & Technol, Guangdong Prov Key Lab Brain Inspired Intelligent, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
CNN accelerator; FPGA; DSE method;
DOI
10.1109/SEDA62518.2024.10617620
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Field-programmable gate arrays (FPGAs) have emerged as a promising platform for accelerating convolutional neural networks (CNNs). In this paper, we propose a low-latency CNN hybrid-accelerator system and an efficient design space exploration (DSE) method. Specifically, our targeted FPGA platform consists of different types of accelerators, which offers two advantages: high concurrency and full utilization of hardware resources (i.e., lookup tables (LUTs) and digital signal processors (DSPs)). In addition, we adopt a bandwidth-aware analytical model of system latency that accounts for pipeline stalls and computation cycles simultaneously. Furthermore, for the huge design space encompassing layer-wise CNN quantization and FPGA hybrid-accelerator architecture, we propose a DSE method (named DiMEGA) aimed at enhancing search efficiency, which is a differentiable method embedded by a genetic algorithm. The performance of our CNN hybrid-accelerator system is demonstrated on a PYNQ-Z2 FPGA platform. The experimental results show that the system latency can be reduced by 42% to 48% without sacrificing accuracy, and that the DSE time of DiMEGA is reduced by 23% on ResNet20-CIFAR10 and by 63% on ResNet56-CIFAR10, compared with state-of-the-art methods.
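The abstract does not give the form of the bandwidth-aware latency model, so the following is only a minimal sketch of the general idea: a layer's latency is its computation cycles plus any stall cycles incurred when data transfer takes longer than computation. Every name here (`ConvLayer`, `layer_latency_cycles`, `macs_per_cycle`, `bytes_per_cycle`) and the assumption that transfer fully overlaps with compute are illustrative and are not taken from the paper.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class ConvLayer:
    """Hypothetical per-layer description (not the paper's notation)."""
    macs: int         # multiply-accumulate operations in the layer
    n_weights: int    # number of weight values
    n_acts_in: int    # number of input activation values
    n_acts_out: int   # number of output activation values
    weight_bits: int  # layer-wise weight quantization bit-width
    act_bits: int     # layer-wise activation quantization bit-width


def layer_latency_cycles(layer: ConvLayer,
                         macs_per_cycle: int,
                         bytes_per_cycle: int) -> int:
    """Estimate latency as computation cycles plus bandwidth stalls.

    Assumes data transfer overlaps with computation, so the pipeline
    stalls only for the cycles by which transfer exceeds compute.
    """
    compute = ceil(layer.macs / macs_per_cycle)
    traffic_bits = (layer.n_weights * layer.weight_bits
                    + (layer.n_acts_in + layer.n_acts_out) * layer.act_bits)
    transfer = ceil(traffic_bits / (8 * bytes_per_cycle))
    stall = max(0, transfer - compute)
    return compute + stall


# Illustrative numbers only: lower bit-widths shrink the traffic term.
layer8 = ConvLayer(macs=1000, n_weights=100, n_acts_in=50, n_acts_out=50,
                   weight_bits=8, act_bits=8)
compute_bound = layer_latency_cycles(layer8, macs_per_cycle=10, bytes_per_cycle=4)
bandwidth_bound = layer_latency_cycles(layer8, macs_per_cycle=10, bytes_per_cycle=1)
```

In this toy model the quantization bit-widths enter only through memory traffic, which is one plausible reason a co-design search would treat quantization and accelerator parameters jointly; the actual model in the paper may differ.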
Pages: 678-683
Number of pages: 6