Optimizing Deep Learning Acceleration on FPGA for Real-Time and Resource-Efficient Image Classification

Cited by: 2
Authors
Khaki, Ahmad Mouri Zadeh [1]
Choi, Ahyoung [1]
Affiliations
[1] Gachon Univ, Dept AI & Software, Seongnam Si 13120, South Korea
Source
APPLIED SCIENCES-BASEL | 2025, Vol. 15, Issue 1
Keywords
AI hardware acceleration; convolutional neural network (CNN); deep learning; field-programmable gate array (FPGA); transfer learning; TO-DIGITAL CONVERTER; DESIGN; IMPLEMENTATION; EYE; CNN;
DOI
10.3390/app15010422
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Deep learning (DL) has revolutionized image classification, yet deploying convolutional neural networks (CNNs) on edge devices for real-time applications remains a significant challenge due to constraints in computation, memory, and power efficiency. This work presents an optimized implementation of VGG16 and VGG19, two widely used CNN architectures, for classifying the CIFAR-10 dataset using transfer learning on field-programmable gate arrays (FPGAs). Utilizing the Xilinx Vitis-AI and TensorFlow2 frameworks, we adapt VGG16 and VGG19 for FPGA deployment through quantization, compression, and hardware-specific optimizations. Our implementation achieves high classification accuracy, with Top-1 accuracy of 89.54% and 87.47% for VGG16 and VGG19, respectively, while delivering significant reductions in inference latency (7.29x and 6.6x compared to CPU-based alternatives). These results highlight the suitability of our approach for resource-efficient, real-time edge applications. Key contributions include a detailed methodology for combining transfer learning with FPGA acceleration, an analysis of hardware resource utilization, and performance benchmarks. This work underscores the potential of FPGA-based solutions to enable scalable, low-latency DL deployments in domains such as autonomous systems, IoT, and mobile devices.
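The abstract names quantization as one of the steps used to adapt the networks for FPGA deployment, but does not describe the scheme. As a rough illustration only, the sketch below shows generic symmetric per-tensor int8 quantization in plain Python. It is an assumption for explanatory purposes, not the Vitis-AI quantizer or the authors' method, and the names `quantize_int8`, `dequantize`, and the sample weights are hypothetical.

```python
# Illustrative sketch: symmetric per-tensor int8 quantization.
# NOT the Vitis-AI quantizer; all names and values here are hypothetical.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single scale factor."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.003, 1.0]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Storing 8-bit integers instead of 32-bit floats cuts weight memory by about 4x and allows integer arithmetic on the accelerator, at the cost of a rounding error of at most half a quantization step per weight.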
Pages: 13