Bridging Search Region Interaction with Template for RGB-T Tracking

Cited by: 44
Authors
Hui, Tianrui [1, 2]
Xun, Zizheng [3, 5]
Peng, Fengguang [3, 5]
Huang, Junshi [4]
Wei, Xiaoming [4]
Wei, Xiaolin [4]
Dai, Jiao [1, 2]
Han, Jizhong [1, 2]
Liu, Si [3, 5]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[3] Beihang Univ, Inst Artificial Intelligence, Beijing, Peoples R China
[4] Meituan, Beijing, Peoples R China
[5] Beihang Univ, Hangzhou Innovat Inst, Hangzhou, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52729.2023.01310
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-T tracking aims to leverage the mutual enhancement and complementarity of the RGB and TIR modalities to improve tracking in various scenarios, where cross-modal interaction is the key component. Some previous methods directly concatenate the RGB and TIR search region features, performing a coarse interaction that introduces redundant background noise. Many other methods sample candidate boxes from search frames and apply various fusion approaches to isolated pairs of RGB and TIR boxes, which confines cross-modal interaction to local regions and leads to inadequate context modeling. To alleviate these limitations, we propose a novel Template-Bridged Search region Interaction (TBSI) module, which uses templates as the medium to bridge cross-modal interaction between RGB and TIR search regions by gathering and distributing target-relevant object and environment contexts. The original templates are also updated with enriched multimodal contexts from the template medium. Our TBSI module is inserted into a ViT backbone for joint feature extraction, search-template matching, and cross-modal interaction. Extensive experiments on three popular RGB-T tracking benchmarks demonstrate that our method achieves new state-of-the-art performance. Code is available at https://github.com/RyanHTR/TBSI.
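The gather-and-distribute idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (the linked repository is the reference for that). It assumes single-head cross-attention with no learned projections, a residual connection, and a template medium initialised as the element-wise mean of the RGB and TIR template tokens. The names `cross_attend` and `tbsi_sketch` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Each query token attends to all context tokens (keys = values = context).

    Assumption: no learned projections, one head, residual connection.
    Tokens are plain lists of floats with the same dimension d.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in context]
        weights = softmax(scores)
        attended = [sum(w * v[j] for w, v in zip(weights, context)) for j in range(d)]
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out


def tbsi_sketch(t_rgb, t_tir, s_rgb, s_tir):
    """Illustrative template-bridged interaction between two search regions."""
    # Assumption: the medium starts as the mean of the two templates.
    medium = [[(a + b) / 2 for a, b in zip(tr, tt)] for tr, tt in zip(t_rgb, t_tir)]

    # RGB -> TIR: the medium gathers context from the RGB search tokens,
    # then the TIR search tokens read that context back from the medium.
    medium_rgb = cross_attend(medium, s_rgb)
    s_tir_new = cross_attend(s_tir, medium_rgb)

    # TIR -> RGB: the symmetric direction.
    medium_tir = cross_attend(medium, s_tir)
    s_rgb_new = cross_attend(s_rgb, medium_tir)

    # The original templates are updated from the enriched medium tokens.
    enriched = medium_rgb + medium_tir
    t_rgb_new = cross_attend(t_rgb, enriched)
    t_tir_new = cross_attend(t_tir, enriched)
    return t_rgb_new, t_tir_new, s_rgb_new, s_tir_new


if __name__ == "__main__":
    t_rgb = [[1.0, 0.0], [0.0, 1.0]]              # 2 template tokens, dim 2
    t_tir = [[0.5, 0.5], [1.0, 1.0]]
    s_rgb = [[1.0, 2.0], [0.0, 1.0], [2.0, 0.0]]  # 3 search tokens, dim 2
    s_tir = [[0.0, 0.0], [1.0, 1.0], [3.0, 1.0]]
    for name, tokens in zip(("t_rgb", "t_tir", "s_rgb", "s_tir"),
                            tbsi_sketch(t_rgb, t_tir, s_rgb, s_tir)):
        print(name, tokens)
```

Because the search tokens only ever exchange information through the small set of medium tokens, the two search regions never attend to each other directly, which is the bridging role the abstract attributes to the templates.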
Pages: 13630-13639
Number of pages: 10