Adaptive feature fusion with attention mechanism for multi-scale target detection

Cited by: 0

Authors:

Moran Ju

Jiangning Luo

Zhongbo Wang

Haibo Luo

Affiliations:

[1] Chinese Academy of Sciences, Shenyang Institute of Automation

[2] Chinese Academy of Sciences, Institutes for Robotics and Intelligent Manufacturing

[3] University of Chinese Academy of Sciences, Key Laboratory of Opt

[4] Chinese Academy of Sciences, Electronic Information Processing

[5] The Key Laboratory of Image Understanding and Computer Vision

[6] McGill University

Source:

Neural Computing and Applications | 2021 / Vol. 33

Keywords:

Deep learning; Target detection; Adaptive feature fusion; Attention mechanism

DOI:

Not available

Abstract:

To detect targets of different sizes, detectors such as YOLO V3 and DSSD use multi-scale outputs. To improve detection performance, YOLO V3 and DSSD fuse features by combining two adjacent scales. However, fusing only adjacent scales is not sufficient, because it does not take advantage of the features at other scales. Moreover, concatenation, a common operation for feature fusion, provides no mechanism for learning the importance and correlation of features at different scales. In this paper, we propose adaptive feature fusion with attention mechanism (AFFAM) for multi-scale target detection. AFFAM uses a pathway layer and a subpixel convolution layer to resize the feature maps, which helps the network learn better and more complex feature mappings. In addition, AFFAM uses a global attention mechanism and a spatial position attention mechanism to adaptively learn, respectively, the correlation of the channel features and the importance of the spatial features at different scales. Finally, we combine AFFAM with YOLO V3 to build an efficient multi-scale target detector. Comparative experiments are conducted on the PASCAL VOC, KITTI and Smart UVM datasets. YOLO V3 with AFFAM achieves 84.34% mean average precision (mAP) at 19.9 FPS on PASCAL VOC, 87.2% mAP at 21 FPS on KITTI and 99.22% mAP at 20.6 FPS on Smart UVM, outperforming other state-of-the-art target detectors.
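
The resizing and channel-weighting ideas mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: `pixel_shuffle` shows only the rearrangement step commonly used in subpixel convolution (the convolution that precedes it is omitted), and `channel_gate` is a hypothetical, weight-free stand-in for a learned channel attention, not the paper's global attention mechanism. All function names and the sigmoid gating are assumptions made for illustration.

```python
import math

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r)."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    assert c_in % (r * r) == 0, "channel count must be divisible by r*r"
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                # Each group of r*r input channels fills one r-by-r output block.
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out

def channel_gate(x):
    """Scale each channel by the sigmoid of its global average.

    Assumption: a real attention module would learn this mapping;
    here it is fixed so the sketch stays self-contained.
    """
    gated = []
    for ch in x:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        g = 1.0 / (1.0 + math.exp(-mean))
        gated.append([[v * g for v in row] for row in ch])
    return gated

def fuse(coarse, fine, r=2):
    """Upsample the coarse map, concatenate it with the fine map, gate channels."""
    up = pixel_shuffle(coarse, r)
    assert len(up[0]) == len(fine[0]) and len(up[0][0]) == len(fine[0][0])
    return channel_gate(up + fine)  # list concatenation along the channel axis

coarse = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]  # 4 channels, 1x1
fine = [[[0.0, 0.0], [0.0, 0.0]]]              # 1 channel, 2x2
fused = fuse(coarse, fine)                     # 2 channels, 2x2
```

In this toy case the four 1x1 coarse channels become one 2x2 channel, so both scales share a spatial size before they are concatenated and reweighted per channel.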

Pages: 2769-2781

Page count: 12

50 entries in total

[1] Adaptive feature fusion with attention mechanism for multi-scale target detection
Ju, Moran
Luo, Jiangning
Wang, Zhongbo
Luo, Haibo
[J]. Neural Computing & Applications, 2021, 33 (07): 2769-2781
[2] Multi-Scale Feature Fusion Attention Network for Infrared Small Target Detection
Zhang, Yidan
Li, Chunlei
Liu, Yundong
Liu, Zhoufeng
Yang, Ruimin
[J]. Fourteenth International Conference on Graphics and Image Processing, ICGIP 2022, 2022, 12705
[3] SSD with multi-scale feature fusion and attention mechanism
Liu, Qiang
Dong, Lijun
Zeng, Zhigao
Zhu, Wenqiu
Zhu, Yanhui
Meng, Chen
[J]. Scientific Reports, 2023, 13 (01)
[4] Residual attention mechanism and weighted feature fusion for multi-scale object detection
Zhang, Jie
Qi, Qiye
Zhang, Huanlong
Du, Qifan
Wang, Fengxian
Shi, Xiaoping
[J]. Multimedia Tools and Applications, 2023, 82 (26): 40873-40889
[5] Multi-scale feature fusion with attention mechanism for crowded road object detection
Wu, Jingtao
Dai, Guojun
Zhou, Wenhui
Zhu, Xudong
Wang, Zengguan
[J]. Journal of Real-Time Image Processing, 2024, 21 (02)
[6] A multi-scale feature fusion target detection algorithm
Dong, Chong
Li, Jingmei
Wang, Jiaxiang
[J]. 2018 International Conference on Image and Video Processing, and Artificial Intelligence, 2018, 10836
[7] MFANet: Multi-scale feature fusion network with attention mechanism
Wang, Gaihua
Gan, Xin
Cao, Qingcheng
Zhai, Qianyu
[J]. Visual Computer, 2023, 39 (07): 2969-2980