Optimizing Deep Learning Acceleration on FPGA for Real-Time and Resource-Efficient Image Classification

Cited by: 2

Authors:

Khaki, Ahmad Mouri Zadeh [1]

Choi, Ahyoung [1]

Affiliations:

[1] Gachon Univ, Dept AI & Software, Seongnam Si 13120, South Korea

Source:

APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / Issue 01

Keywords:

AI hardware acceleration; convolutional neural network (CNN); deep learning; field-programmable gate array (FPGA); transfer learning; TO-DIGITAL CONVERTER; DESIGN; IMPLEMENTATION; EYE; CNN;

DOI:

10.3390/app15010422

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Subject classification code:

0703;

Abstract:

Deep learning (DL) has revolutionized image classification, yet deploying convolutional neural networks (CNNs) on edge devices for real-time applications remains a significant challenge due to constraints in computation, memory, and power efficiency. This work presents an optimized implementation of VGG16 and VGG19, two widely used CNN architectures, for classifying the CIFAR-10 dataset using transfer learning on field-programmable gate arrays (FPGAs). Utilizing the Xilinx Vitis-AI and TensorFlow2 frameworks, we adapt VGG16 and VGG19 for FPGA deployment through quantization, compression, and hardware-specific optimizations. Our implementation achieves high classification accuracy, with Top-1 accuracy of 89.54% and 87.47% for VGG16 and VGG19, respectively, while delivering significant reductions in inference latency (7.29x and 6.6x compared to CPU-based alternatives). These results highlight the suitability of our approach for resource-efficient, real-time edge applications. Key contributions include a detailed methodology for combining transfer learning with FPGA acceleration, an analysis of hardware resource utilization, and performance benchmarks. This work underscores the potential of FPGA-based solutions to enable scalable, low-latency DL deployments in domains such as autonomous systems, IoT, and mobile devices.
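The abstract names quantization as one of the steps used to adapt the networks for FPGA deployment. As a rough illustration of the general idea only, the sketch below applies symmetric per-tensor 8-bit quantization to a list of weights in plain Python. It is an assumption-laden toy, not the paper's pipeline and not the Vitis-AI quantizer: the function names `quantize_int8` and `dequantize`, the per-tensor scale, and the sample weights are all hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to signed 8-bit codes.

    Generic scheme for illustration; the paper's actual quantizer may differ.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to [-127, 127].
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical values, not from the paper
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print("codes:", codes)
    print("scale:", scale)
    print("max reconstruction error:", max_err)
```

With round-to-nearest, the reconstruction error of each weight is bounded by about half the scale. This is why 8-bit weights can often stand in for 32-bit floats at a modest accuracy cost while reducing memory traffic on the device.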


Pages: 13

