Toward Efficient Co-Design of CNN Quantization and HW Architecture on FPGA Hybrid-Accelerator

Cited by: 0

Authors:

Zhang, Yiran [1]

Li, Guiying [1]

Yuan, Bo [1]

Affiliations:

[1] Southern Univ Sci & Technol, Guangdong Prov Key Lab Brain Inspired Intelligent, Shenzhen, Peoples R China

Source:

2024 INTERNATIONAL SYMPOSIUM OF ELECTRONICS DESIGN AUTOMATION, ISEDA 2024 | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

CNN accelerator; FPGA; DSE method;

DOI:

10.1109/SEDA62518.2024.10617620

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Field-programmable gate arrays (FPGAs) have emerged as a promising platform for accelerating convolutional neural networks (CNNs). In this paper, we propose a low-latency CNN hybrid-accelerator system and an efficient design space exploration (DSE) method. Specifically, our targeted FPGA platform consists of different types of accelerators, which brings two advantages: high concurrency and full utilization of hardware resources (i.e., lookup tables (LUTs) and digital signal processors (DSPs)). In addition, we adopt a bandwidth-aware analytical model of system latency that accounts for pipeline stalls and computation cycles simultaneously. Furthermore, to handle the huge design space spanning layer-wise CNN quantization and the FPGA hybrid-accelerator architecture, we propose a DSE method named DiMEGA, a differentiable method embedded with a genetic algorithm, aimed at improving search efficiency. The performance of our CNN hybrid-accelerator system is demonstrated on a PYNQ-Z2 FPGA platform. The experimental results show that system latency is reduced by 42% to 48% without sacrificing accuracy, and that the DSE time of DiMEGA is reduced by 23% on ResNet20-CIFAR10 and by 63% on ResNet56-CIFAR10, compared with the state of the art (SOTA).
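
The abstract does not give the form of the bandwidth-aware latency model, so the following is only a minimal illustrative sketch of the general idea, not the authors' model. It assumes that data transfer overlaps with computation (double buffering), so a layer stalls only for the cycles by which transfer exceeds compute. All names and parameters (Layer, macs_per_cycle, bus_bits_per_cycle) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    macs: int      # multiply-accumulate operations in the layer
    elements: int  # values moved to/from off-chip memory (hypothetical count)
    bits: int      # per-layer quantization bit-width (hypothetical choice)


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for positive operands."""
    return -(-a // b)


def layer_latency(layer: Layer, macs_per_cycle: int, bus_bits_per_cycle: int) -> int:
    """Cycles for one layer: computation cycles plus pipeline stall cycles.

    Assumption: transfer overlaps with compute, so the stall is the
    amount by which transfer time exceeds compute time (never negative).
    """
    compute = ceil_div(layer.macs, macs_per_cycle)
    transfer = ceil_div(layer.elements * layer.bits, bus_bits_per_cycle)
    stall = max(0, transfer - compute)
    return compute + stall


def network_latency(layers, macs_per_cycle: int, bus_bits_per_cycle: int) -> int:
    """Total cycles, assuming layers execute one after another."""
    return sum(layer_latency(l, macs_per_cycle, bus_bits_per_cycle) for l in layers)


# Same layer at two bit-widths: halving the bit-width removes the stall here.
wide = Layer(macs=1000, elements=400, bits=8)
narrow = Layer(macs=1000, elements=400, bits=4)
total = network_latency([wide, narrow], macs_per_cycle=10, bus_bits_per_cycle=16)
```

In this toy setting the 8-bit layer is bandwidth-bound and the 4-bit layer is compute-bound, which illustrates why a latency model that ignores bandwidth could misjudge the benefit of a quantization choice.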


Pages: 678-683

Page count: 6

