End-to-End Object Detection with YOLOF

Cited by: 0

Authors:

Xi, Xing [1]

Huang, Yangyang [1]

Wu, Weiye [1]

Luo, Ronghua [1]

Affiliations:

[1] South China University of Technology, Guangzhou, People's Republic of China

Source:

ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT VII, ICIC 2024 | 2024 / Vol. 14868

Keywords:

YOLOF; End-to-end Detector; Non-Maximum Suppression; Object Detection;

DOI:

10.1007/978-981-97-5600-1_9

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Within the field of computer vision, object detection is a core issue. A technique extensively utilized in convolution-oriented detectors is Non-Maximum Suppression (NMS), designed to suppress redundant predictions. However, the sequential nature intrinsic to NMS inhibits its capacity for parallel execution, consequently restricting the inference speed. Furthermore, the recall rate of detectors with NMS is also affected in scenes with high object density and overlap. In this paper, we propose a real-time and end-to-end detector with YOLOF (You Only Look One-level Feature). The proposed methods do not introduce additional parameters or attention mechanisms, making them practical for real-time applications. Specifically, we propose the stop-gradient strategy to train only a portion of parameters to address the problem of weak supervision in one-to-one label assignment. We also present auxiliary losses to strengthen the supervision of negative samples during training and use semantic anchor optimization to suppress other anchors in the same location. These techniques allow the improved YOLOF to discard NMS within a 1 mAP gap and achieve faster inference speed. Our YOLOF-CSP-D53-DC5 achieves 42.7 mAP, only 0.5 mAP lower than the original version. Additionally, our YOLOF-R50 achieves a 37.1 mAP at 38 FPS and exceeds state-of-the-art networks by more than 1.5 times in inference speed.
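The sequential dependency that the abstract attributes to NMS can be illustrated with a minimal sketch of the standard greedy algorithm. This is not the authors' code; the function names `iou` and `greedy_nms`, the (x1, y1, x2, y2) box format, and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first (illustrative sketch)."""
    # Visit candidates in descending score order.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Whether box i survives depends on every box kept before it,
        # which is why this loop cannot be trivially parallelized.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical example: boxes 0 and 1 overlap heavily, box 2 is far away.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

In this example box 1 is discarded because its IoU with box 0 exceeds the threshold. If boxes 0 and 1 were two distinct, closely overlapping objects, the second would be lost, which is the recall problem in dense scenes that the abstract mentions.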


Pages: 101-112

Page count: 12
