Multi-Modal Object Detection Method Based on Dual-Branch Asymmetric Attention Backbone and Feature Fusion Pyramid Network

Cited by: 1
Authors
Wang, Jinpeng [1]
Su, Nan [1,2]
Zhao, Chunhui [1,2]
Yan, Yiming [1,2]
Feng, Shou [1,2]
Affiliations
[1] Harbin Engn Univ, Coll Informat & Commun Engn, Harbin 150001, Peoples R China
[2] Harbin Engn Univ, Key Lab Adv Marine Commun & Informat Technol, Minist Ind & Informat Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
multi-modal fusion; object detection; asymmetric attention; remote sensing images
DOI
10.3390/rs16203904
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
As simultaneous acquisition of infrared and optical remote sensing images of the same target becomes increasingly easy, using multi-modal data for high-performance object detection has become a research focus. In remote sensing multi-modal data, infrared images lack color information, difficult low-contrast targets are hard to detect, and optical images are easily affected by illumination conditions. One of the most effective ways to address these limitations is to integrate multi-modal images for object detection. The challenge of fusion-based object detection lies in fully integrating multi-modal image features that differ significantly across modalities, exploiting their complementary strengths without introducing interfering information. To address these problems, this paper proposes a new multi-modal fusion object detection method with two main contributions. First, a new dual-branch asymmetric attention backbone network (DAAB) is designed, which uses a semantic information supplement module (SISM) and a detail information supplement module (DISM) to supplement and enhance infrared and RGB image information, respectively. Second, a feature fusion pyramid network (FFPN) is proposed, which uses a Transformer-like strategy to carry out multi-modal feature fusion and suppresses features that are not conducive to fusion during the fusion process. The method achieves state-of-the-art results on both the FLIR-aligned and DroneVehicle datasets. Experiments show that it is highly competitive and generalizes well.
Pages: 15
Related papers
50 items in total
  • [41] Three-Dimensional Object Detection Network Based on Multi-Layer and Multi-Modal Fusion
    Zhu, Wenming
    Zhou, Jia
    Wang, Zizhe
    Zhou, Xuehua
    Zhou, Feng
    Sun, Jingwen
    Song, Mingrui
    Zhou, Zhiguo
    ELECTRONICS, 2024, 13 (17)
  • [42] Dual-branch mutual assistance network for salient object detection
    Yao, Zhaojian
    Wang, Luping
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2022, 37 (01) : 972 - 990
  • [43] Quality-Driven Dual-Branch Feature Integration Network for Video Salient Object Detection
    Zhou, Xiaofei
    Gao, Hanxiao
    Yu, Longxuan
    Yang, Defu
    Zhang, Jiyong
    ELECTRONICS, 2023, 12 (03)
  • [44] DBTrans: A Dual-Branch Vision Transformer for Multi-Modal Brain Tumor Segmentation
    Zeng, Xinyi
    Zeng, Pinxian
    Tang, Cheng
    Wang, Peng
    Yan, Binyu
    Wang, Yan
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT IV, 2023, 14223 : 502 - 512
  • [45] DVFENet: Dual-branch voxel feature extraction network for 3D object detection
    He, Yunqian
    Xia, Guihua
    Luo, Yongkang
    Su, Li
    Zhang, Zhi
    Li, Wanyi
    Wang, Peng
    NEUROCOMPUTING, 2021, 459 : 201 - 211
  • [46] Hyperspectral unmixing method based on dual-branch multiscale residual attention network
    Chen, Congping
    Xu, Zhiwei
    Lu, Peng
    Cao, Nuo
    OPTICAL ENGINEERING, 2023, 62 (09)
  • [47] Object Detection Network Based on Feature Fusion and Attention Mechanism
    Zhang, Ying
    Chen, Yimin
    Huang, Chen
    Gao, Mingke
    FUTURE INTERNET, 2019, 11 (01)
  • [48] Multi-level and Multi-modal Target Detection Based on Feature Fusion
    Cheng, T.
    Sun, L.
    Hou, D.
    Shi, Q.
    Zhang, J.
    Chen, J.
    Huang, H.
    Qiche Gongcheng/Automotive Engineering, 2021, 43 (11) : 1602 - 1610
  • [49] A dual-branch multi-feature deep fusion network framework for hyperspectral image classification
    Liu, Linfeng
    Zhang, Chengcai
    Luo, Weiran
    GEOCARTO INTERNATIONAL, 2022, 37 (27) : 18692 - 18715
  • [50] A robot grasping detection network based on flexible selection of multi-modal feature fusion structure
    Wang, Yuhan
    Guo, Zhibo
    Chen, Yu
    Guo, Chaiqi
    Xia, Meizhen
    Qi, Tingyue
    APPLIED INTELLIGENCE, 2024, 54 (06) : 5044 - 5061