Learning Modality Complementary Features with Mixed Attention Mechanism for RGB-T Tracking

Cited by: 8
Authors
Luo, Yang [1 ,2 ]
Guo, Xiqing [1 ,2 ]
Dong, Mingtao [3 ]
Yu, Jin [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100040, Peoples R China
[3] Northeastern Univ, Inst Image Recognit & Machine Intelligence, Shenyang 110167, Peoples R China
Keywords
multi-modality adaptive fusion; mixed-attention mechanism; RGB-T tracking; NETWORK;
DOI
10.3390/s23146609
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
RGB-T tracking uses images from both the visible and thermal modalities. The primary objective is to adaptively exploit whichever modality is dominant under the current conditions, achieving more robust tracking than single-modality methods. This paper proposes MACFT, an RGB-T tracker that uses a mixed-attention mechanism to achieve complementary fusion of the modalities. In the feature extraction stage, different transformer backbone branches extract modality-specific and modality-shared information. Mixed-attention operations in the backbone enable information interaction and self-enhancement between the template and search images, yielding a robust feature representation that better captures the high-level semantic features of the target. In the feature fusion stage, a modality shared-specific feature interaction structure built on the mixed-attention mechanism suppresses noise from the low-quality modality while enhancing information from the dominant modality. Evaluation on multiple public RGB-T datasets shows that the proposed tracker outperforms other RGB-T trackers on general evaluation metrics and can also adapt to long-term tracking scenarios.
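The mixed-attention idea described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It assumes single-head scaled dot-product attention with identity query/key/value projections, and a symmetric scheme in which template and search tokens are concatenated and attend to one another. The function names and toy inputs are hypothetical, and this is not the MACFT implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return out


def mixed_attention(template_tokens, search_tokens):
    """Attend over the concatenated template + search sequence.

    Assumption: identity Q/K/V projections (a real tracker would learn them).
    Every token sees both its own image (self-enhancement) and the other
    image (template/search interaction).
    """
    tokens = template_tokens + search_tokens
    mixed = attention(tokens, tokens, tokens)
    n_t = len(template_tokens)
    return mixed[:n_t], mixed[n_t:]


# Toy 2-D tokens: two template tokens, three search tokens (made-up values).
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 1.0], [0.5, -0.5], [0.0, 2.0]]
new_template, new_search = mixed_attention(template, search)
```

Because the attention weights in each row sum to one, every output token is a convex combination of the input tokens, so the template and search representations are each updated with information from both images.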
Pages: 19