End-to-end acceleration of the YOLO object detection framework on FPGA-only devices

Cited by: 5

Authors:

Zhang, Dezheng [1,2]

Wang, Aibin [1,2]

Mo, Ruchan [1,2]

Wang, Dong [1,2]

Affiliations:

[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China

[2] Beijing Key Lab Adv Informat Sci & Network Technol, Beijing 100044, Peoples R China

Source:

NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / Issue 03

Funding:

Beijing Natural Science Foundation;

Keywords:

Convolution neural networks (CNN); Object detection; YOLOv2; Field-programmable gate array (FPGA); High-level synthesis (HLS); Accelerator architecture; Post-processing;

DOI:

10.1007/s00521-023-09078-8

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Object detection has been revolutionized by convolutional neural networks (CNNs), but their high computational complexity and heavy data access requirements make these algorithms challenging to implement on edge devices. To address this issue, we propose an efficient object detection accelerator for the YOLO series of algorithms. Our architecture exploits multiple dimensions of parallelism to accelerate the convolution computation. We employ line-buffer-based parallel data caches and dedicated data access units to minimize off-chip bandwidth pressure. Additionally, our proposed design accelerates not only the convolutional computation but also the control-intensive post-processing, achieving low detection latency. We evaluate the final design on a Xilinx V7-690t FPGA device, achieving a throughput of 525 GOP/s for a batch size of 1 and 914 GOP/s for a batch size of 2. Compared with state-of-the-art YOLOv2 and YOLOv3 implementations, our proposed accelerator offers up to 9x higher throughput and 5x shorter latency.

Pages: 1067-1089

Page count: 23

50 items in total

[1] End-to-end acceleration of the YOLO object detection framework on FPGA-only devices
Dezheng Zhang
Aibin Wang
Ruchan Mo
Dong Wang
Neural Computing and Applications, 2024, 36 : 1067 - 1089
[2] Watch Only Once: An End-to-End Video Action Detection Framework
Chen, Shoufa
Sun, Peize
Xie, Enze
Ge, Chongjian
Wu, Jiannan
Ma, Lan
Shen, Jiajun
Luo, Ping
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 8158 - 8167
[3] Sparse R-CNN: An End-to-End Framework for Object Detection
Sun, Peize
Zhang, Rufeng
Jiang, Yi
Kong, Tao
Xu, Chenfeng
Zhan, Wei
Tomizuka, Masayoshi
Yuan, Zehuan
Luo, Ping
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15650 - 15664
[4] End-to-End Object Detection with YOLOF
Xi, Xing
Huang, Yangyang
Wu, Weiye
Luo, Ronghua
ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT VII, ICIC 2024, 2024, 14868 : 101 - 112
[5] Enhanced Sparse Detection for End-to-End Object Detection
Liao, Yongwei
Chen, Gang
Xu, Runnan
IEEE ACCESS, 2022, 10 : 85630 - 85640
[6] EOOD: End-to-end oriented object detection
Zhang, Caiguang
Chen, Zilong
Xiong, Boli
Ji, Kefeng
Kuang, Gangyao
NEUROCOMPUTING, 2025, 621
[7] Intrinsic Explainability for End-to-End Object Detection
Fernandes, Luis
Fernandes, Joao N. D.
Calado, Mariana
Pinto, Joao Ribeiro
Cerqueira, Ricardo
Cardoso, Jaime S.
IEEE ACCESS, 2024, 12 : 2623 - 2634
[8] What Makes for End-to-End Object Detection?
Sun, Peize
Jiang, Yi
Xie, Enze
Shao, Wenqi
Yuan, Zehuan
Wang, Changhu
Luo, Ping
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[9] FlexiGAN: An End-to-End Solution for FPGA Acceleration of Generative Adversarial Networks
Yazdanbakhsh, Amir
Brzozowski, Michael
Khaleghi, Behnam
Ghodrati, Soroush
Samadi, Kambiz
Kim, Nam Sung
Esmaeilzadeh, Hadi
PROCEEDINGS 26TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2018), 2018, : 65 - 72
[10] FlexCNN: An End-to-end Framework for Composing CNN Accelerators on FPGA
Basalama, Suhail
Sohrabizadeh, Atefeh
Wang, Jie
Guo, Licheng
Cong, Jason
ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2023, 16 (02)
