Visual and Language Collaborative Learning for RGBT Object Tracking

Authors
Wang, Jiahao [1]
Liu, Fang [1]
Jiao, Licheng [1]
Gao, Yingjia [1]
Wang, Hao [1]
Li, Shuo [1]
Li, Lingling [1]
Chen, Puhua [1]
Liu, Xu [1]
Affiliations
[1] Xidian University, Key Laboratory of Intelligent Perception and Image Understanding, Ministry of Education, International Research Center for Intelligent Perception and Computation, Joint International Research Laboratory of Intelligent Perception and Comp
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Benchmarking - Clutter (information theory) - Infrared imaging - Job analysis - Object recognition - Target tracking - Timing circuits - Visual languages;
DOI
10.1109/TCSVT.2024.3436878
Abstract
Despite extensive research on RGBT object tracking, several challenges remain in practical applications, such as modality differences, lighting variations, target disappearance, and viewpoint changes. Existing methods mostly address these issues by fusing image features while neglecting a significant amount of target label information. To address these challenges, this paper introduces text to drive the alignment of visible and infrared image features, transforming features from different modalities into the same feature space and fully exploiting the complementary features between modalities. Furthermore, inspired by the success of prompt learning in various tasks, we use prior boxes and language as prompts to further guide the model in tracking the target. Extensive experiments demonstrate that the proposed VLCTrack tracker has excellent potential in RGBT object tracking. Compared with previous RGBT tracking methods, our approach achieves state-of-the-art performance on three benchmark datasets. © 1991-2012 IEEE.
Pages: 12770-12781