Modeling of Multiple Spatial-Temporal Relations for Robust Visual Object Tracking

Cited by: 0
Authors
Wang, Shilei [1, 2]
Wang, Zhenhua [1, 2]
Sun, Qianqian [1, 2]
Cheng, Gong [3]
Ning, Jifeng [1, 2]
Affiliations
[1] Northwest A&F Univ, Coll Informat Engn, Yangling 712100, Peoples R China
[2] Shaanxi Engn Res Ctr Agr Informat Intelligent Perc, Yangling 712100, Peoples R China
[3] Northwestern Polytech Univ, Sch Automat, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Transformers; Trajectory; Feature extraction; Target tracking; Adaptation models; Computational modeling; Object tracking; Visual object tracking; transformer; spatial-temporal modeling; adaptive update;
DOI
10.1109/TIP.2024.3453028
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, one-stream trackers have achieved parallel feature extraction and relation modeling by exploiting Transformer-based architectures, a design that greatly improves tracking performance. However, because one-stream trackers often overlook crucial tracking cues beyond the template, they are prone to unsatisfactory results in complex tracking scenarios. To tackle this challenge, we propose a multi-cue single-stream tracker, dubbed MCTrack, which seamlessly integrates template information, the historical trajectory, historical frames, and the search region for synchronized feature extraction and relation modeling. To achieve this, we employ two types of encoders to convert the template, historical frames, search region, and historical trajectory into tokens, which are then collectively fed into a Transformer architecture. To distill temporal and spatial cues, we introduce a novel adaptive update mechanism, which incorporates a thresholding component and a local multi-peak component to filter out less accurate and overly disturbed tracking cues. Empirically, MCTrack achieves leading performance on mainstream benchmark datasets, surpassing the most advanced SeqTrack by 2.0% in terms of the AO metric on GOT-10k. The code is available at https://github.com/wsumel/MCTrack.
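The abstract names a thresholding component and a local multi-peak component but does not say how either is computed. The sketch below is only one plausible reading, not the authors' implementation: it assumes the tracker produces a 2D response map, accepts a frame as a new tracking cue when the top score clears a confidence threshold, and rejects it when a second strong local peak suggests a distractor. The function names `local_peaks` and `should_update` and the parameters `conf_threshold` and `peak_ratio` are hypothetical, as are their default values.

```python
from typing import List, Tuple

ScoreMap = List[List[float]]


def local_peaks(score_map: ScoreMap, min_score: float) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for cells that reach min_score and are
    not lower than any of their (up to 8) neighbours."""
    rows, cols = len(score_map), len(score_map[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            s = score_map[r][c]
            if s < min_score:
                continue
            neighbours = [
                score_map[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            # Ties count as peaks, so a flat plateau yields several peaks.
            if all(s >= n for n in neighbours):
                peaks.append((r, c, s))
    return peaks


def should_update(score_map: ScoreMap,
                  conf_threshold: float = 0.7,  # assumed value
                  peak_ratio: float = 0.5) -> bool:  # assumed value
    """Hypothetical update gate: confident AND single-peaked."""
    best = max(max(row) for row in score_map)
    # Thresholding step: skip low-confidence (less accurate) frames.
    if best < conf_threshold:
        return False
    # Multi-peak step: skip frames with a competing strong peak (disturbed).
    strong = local_peaks(score_map, min_score=peak_ratio * best)
    return len(strong) == 1


def make_map(size: int, peaks: List[Tuple[int, int, float]]) -> ScoreMap:
    """Build a size x size map with a 0.1 background and the given peaks."""
    grid = [[0.1] * size for _ in range(size)]
    for r, c, s in peaks:
        grid[r][c] = s
    return grid


clean = make_map(5, [(2, 2, 0.9)])
uncertain = make_map(5, [(2, 2, 0.5)])
distractor = make_map(5, [(1, 1, 0.9), (3, 3, 0.8)])
```

With these toy maps, `should_update(clean)` accepts the frame, while `uncertain` fails the threshold check and `distractor` fails the multi-peak check. The released code at the URL above is the authoritative reference for the actual mechanism.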
Pages: 5073-5085
Number of pages: 13