Matching Multi-Scale Features and Prediction Tasks for Real-Time Object Detection

Cited by: 0

Authors:

Du Hongjie [1]

Sun Hanqing [1]

Cao Jiale [1]

Pang Yanwei [1]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

Source:

LASER & OPTOELECTRONICS PROGRESS | 2021 / Vol. 58 / No. 12

Keywords:

image processing; real-time object detection; convolutional neural network; multi-scale feature; match;

DOI:

10.3788/LOP202158.1210014

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In object detection algorithms based on convolutional neural networks, high-resolution features from shallow layers contain more detailed information, which helps abstract features complete the accurate localization task, while deep features contain abstract semantic information, which is better suited to the target existence prediction task. Most existing anchor-free detection methods predict all tasks directly on the same feature map and therefore do not match these features to the prediction tasks, which limits detection accuracy. To this end, the MFT detector, a real-time object detection algorithm that matches multi-scale features with the prediction tasks, is proposed. The MFT detector is based on the CenterNet detector; it matches shallow detail features with the accurate localization task, and matches multi-scale, multi-receptive-field abstract features with the target existence prediction task. Experimental results show that the proposed MFT detector alleviates the mismatch between features and prediction tasks and significantly improves detection precision while maintaining a high speed of 94.5 frames/s, which meets the requirements of a real-time vision system.

Pages: 10
