Visual and Language Collaborative Learning for RGBT Object Tracking

Authors
Wang, Jiahao [1]
Liu, Fang [1]
Jiao, Licheng [1]
Gao, Yingjia [1]
Wang, Hao [1]
Li, Shuo [1]
Li, Lingling [1]
Chen, Puhua [1]
Liu, Xu [1]
Affiliations
[1] Xidian University, Key Laboratory of Intelligent Perception and Image Understanding, Ministry of Education, International Research Center for Intelligent Perception and Computation, Joint International Research Laboratory of Intelligent Perception and Comp
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Benchmarking; Clutter (information theory); Infrared imaging; Job analysis; Object recognition; Target tracking; Timing circuits; Visual languages
DOI
10.1109/TCSVT.2024.3436878
Abstract
Despite extensive research on RGBT object tracking, several challenges remain in practical applications, such as modality differences, lighting variations, target disappearance, and viewpoint changes. Existing methods mostly address these issues by fusing image features while neglecting a significant amount of target label information. To address these challenges, this paper introduces text to drive the alignment of visible and infrared image features, transforming features from different modalities into the same feature space and making full use of their complementary information. Furthermore, inspired by the success of prompt learning in various tasks, we use prior boxes and language as prompts to further guide the model in tracking the target. Extensive experiments demonstrate that the proposed VLCTrack tracker has excellent potential in RGBT object tracking. Compared with previous methods developed for this purpose, our approach achieves state-of-the-art performance on three benchmark datasets.
Pages: 12770-12781