Multi-Modal Object Detection Method Based on Dual-Branch Asymmetric Attention Backbone and Feature Fusion Pyramid Network

Cited by: 1
Authors
Wang, Jinpeng [1]
Su, Nan [1, 2]
Zhao, Chunhui [1, 2]
Yan, Yiming [1, 2]
Feng, Shou [1, 2]
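Affiliations
[1] Harbin Engineering University, College of Information and Communication Engineering, Harbin 150001, People's Republic of China
[2] Harbin Engineering University, Key Laboratory of Advanced Marine Communication and Information Technology, Ministry of Industry and Information Technology, Harbin 150001, People's Republic of China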
Funding
National Natural Science Foundation of China
Keywords
multi-modal fusion; object detection; asymmetric attention; remote sensing images
DOI
10.3390/rs16203904
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
As simultaneous acquisition of infrared and optical remote sensing images of the same target becomes increasingly easy, using multi-modal data for high-performance object detection has become a research focus. In multi-modal remote sensing data, infrared images lack color information and make low-contrast targets difficult to detect, whereas optical images are easily affected by illumination conditions. One of the most effective ways to address these limitations is to fuse the multi-modal images for detection. The challenge in fusion-based object detection lies in fully integrating multi-modal image features that differ significantly between modalities, exploiting their complementary strengths while avoiding the introduction of interfering information. To address these problems, a new multi-modal fusion object detection method is proposed, with improvements in two respects. First, a new dual-branch asymmetric attention backbone network (DAAB) is designed, which uses a semantic information supplement module (SISM) and a detail information supplement module (DISM) to supplement and enhance infrared and RGB image information, respectively. Second, a feature fusion pyramid network (FFPN) is proposed, which uses a Transformer-like strategy to carry out multi-modal feature fusion and suppresses features that are not conducive to fusion during the fusion process. The method achieves state-of-the-art performance on both the FLIR-aligned and DroneVehicle datasets. Experiments show that the method is highly competitive and generalizes well.
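The abstract does not spell out how the Transformer-like fusion works, so the following is only a generic illustration of the idea, not the authors' FFPN. It sketches cross-attention in which RGB tokens query infrared tokens, followed by a sigmoid gate that can down-weight infrared context that does not help the fusion. The function name `cross_attention_fuse`, the random projection weights, and the gating form are all assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_fuse(rgb, ir, rng):
    """Fuse two (N, C) token matrices with gated cross-attention.

    Hypothetical sketch: weights are random stand-ins for learned
    parameters, and the gate is one simple way to suppress
    unhelpful cross-modal features.
    """
    n, c = rgb.shape
    # Random projections standing in for learned query/key/value/gate weights.
    w_q, w_k, w_v, w_g = (
        rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(4)
    )
    q = rgb @ w_q                      # queries come from the RGB branch
    k = ir @ w_k                       # keys come from the infrared branch
    v = ir @ w_v                       # values come from the infrared branch
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)   # (N, N) attention weights
    ir_context = attn @ v              # infrared context for each RGB token
    # Gate in [0, 1]: values near 0 suppress the infrared contribution.
    gate = 1.0 / (1.0 + np.exp(-((rgb + ir_context) @ w_g)))
    fused = rgb + gate * ir_context    # residual, gated fusion
    return fused, attn, gate


rng = np.random.default_rng(0)
rgb_tokens = rng.standard_normal((16, 8))   # 16 spatial tokens, 8 channels
ir_tokens = rng.standard_normal((16, 8))
fused, attn, gate = cross_attention_fuse(rgb_tokens, ir_tokens, rng)
print(fused.shape)  # (16, 8)
```

In a trained detector the projections would be learned and the fusion would typically run at several pyramid levels; this sketch only shows the attention-plus-gating pattern on a single level.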
Pages: 15
Related papers
50 in total
  • [21] Multi-level feature fusion pyramid network for object detection
    Zebin Guo
    Hui Shuai
    Guangcan Liu
    Yisheng Zhu
    Wenqing Wang
    The Visual Computer, 2023, 39: 4267-4277
  • [22] Multi-modal fusion of satellite and street-view images for urban village classification based on a dual-branch deep neural network
    Chen, Boan
    Feng, Quanlong
    Niu, Bowen
    Yan, Fengqin
    Gao, Bingbo
    Yang, Jianyu
    Gong, Jianhua
    Liu, Jiantao
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2022, 109
  • [23] DDFNet-A: Attention-Based Dual-Branch Feature Decomposition Fusion Network for Infrared and Visible Image Fusion
    Wei, Qiancheng
    Liu, Ying
    Jiang, Xiaoping
    Zhang, Ben
    Su, Qiya
    Yu, Muyao
    REMOTE SENSING, 2024, 16 (10)
  • [24] Dual Attention Based Image Pyramid Network for Object Detection
    Dong, Xiang
    Li, Feng
    Bai, Huihui
    Zhao, Yao
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2021, 15 (12): 4439-4455
  • [25] Dual-branch network object detection algorithm based on dual-modality fusion of visible and infrared images
    Hou, Zhiqiang
    Li, Xinyue
    Yang, Chen
    Ma, Sugang
    Yu, Wangsheng
    Wang, Yunchen
    MULTIMEDIA SYSTEMS, 2024, 30 (06)
  • [26] Dual-domain deformable feature fusion for multi-modal 3D object detection
    Wang, Shihao
    Deng, Tao
    JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (06)
  • [27] Video object tracking algorithm based on dual-branch online optimization and feature fusion
    Li, Xinpeng
    Wang, Peng
    Li, Xiaoyan
    Sun, Mengyu
    Chen, Zuntian
    Gao, Hui
    CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS, 2024, 39 (08)
  • [28] Text Detection on Industrial Barrel Label with Convolutional Attention and Dual-Branch Feature Network
    Wang, Ling
    Zhang, Jing
    Wang, Peng
    Bai, Yane
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2025, 20 (04): 526-536
  • [30] Attention-based multi-modal fusion sarcasm detection
    Liu, Jing
    Tian, Shengwei
    Yu, Long
    Long, Jun
    Zhou, Tiejun
    Wang, Bo
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (02): 2097-2108