Joint spatio-temporal modeling for visual tracking

Cited by: 5

Authors:

Sun, Yumei [1,2,3,4,5]

Tang, Chuanming [1,2,3,4,5]

Luo, Hui [1,2,3,4,5]

Li, Qingqing [1,2,3,5]

Peng, Xiaoming [5]

Zhang, Jianlin [1,2,3,4,5]

Li, Meihui [1,2,3,5]

Wei, Yuxing [1,2,3,5]

Affiliations:

[1] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 108408, Peoples R China

[2] Chinese Acad Sci, Key Lab Opt Engn, Chengdu 610209, Peoples R China

[3] Chinese Acad Sci, Inst Opt & Elect, Chengdu 610209, Peoples R China

[4] Chinese Acad Sci, Natl Key Lab Opt Field Manipulat Sci & Technol, Chengdu 610209, Peoples R China

[5] Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu 611731, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2024 / Vol. 283

Keywords:

Visual tracking; Siamese trackers; Sequence prediction; Spatio-temporal model

DOI:

10.1016/j.knosys.2023.111206

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Similarity-based approaches have made significant progress in visual object tracking (VOT). Although these methods work well in simple scenes, they ignore the continuous spatio-temporal connection of the object in the video sequence. As a result, tracking by spatial matching alone can fail in the presence of distractors and occlusion. In this paper, we propose a spatio-temporal joint-modeling tracker named STTrack, which implicitly builds continuous connections between the temporal and spatial aspects of the sequence. Specifically, we first design a time-sequence iteration strategy (TSIS) to concentrate on the temporal connection of the object in the video sequence. Then, we propose a novel spatial temporal interaction Transformer network (STIN) to capture the spatio-temporal correlation of the object between frames. The proposed STIN module is robust to object occlusion because it explores the dynamic state change dependencies of the object. Finally, we introduce a spatio-temporal query to suppress distractors by iteratively propagating the target prior. Extensive experiments on six tracking benchmark datasets demonstrate that the proposed STTrack achieves excellent performance while operating in real time. The code is publicly available at https://github.com/nubsym/STTrack.

Pages: 10

50 items in total

[21] Adaptive Spatio-Temporal Context Learning for Visual Target Tracking
Marvasti-Zadeh, Seyed Mojtaba
Ghanei-Yakhdan, Hossein
Kasaei, Shohreh
2017 10TH IRANIAN CONFERENCE ON MACHINE VISION AND IMAGE PROCESSING (MVIP), 2017 : 10 - 14
[22] Memory Prompt for Spatio-Temporal Transformer Visual Object Tracking
Xu T.
Wu X.
Zhu X.
Kittler J.
IEEE Transactions on Artificial Intelligence, 2024, 5 (08) : 1 - 6
[23] Robust Visual Tracking with Dual Spatio-Temporal Context Trackers
Sun, Shiyan
Zhang, Hong
Yuan, Ding
SEVENTH INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2015), 2015, 9817
[24] Tracking and Modeling of Spatio-Temporal Fields with a Mobile Sensor Network
Lu, Bowen
Gu, Dongbing
Hu, Huosheng
2014 11TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2014 : 2711 - 2716
[25] Airborne target tracking based on spatio-temporal saliency modeling
Zhang W.
Zhong S.
Wang J.
Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica, 2020, 41 (03)
[26] Modeling and tracking of spatio-temporal scalar fields with multiple robots
Yan, Chuanbo
Zhang, Tao
PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017 : 8548 - 8552
[27] Aberrance suppressed spatio-temporal correlation filters for visual object tracking
Elayaperumal, Dinesh
Joo, Young Hoon
PATTERN RECOGNITION, 2021, 115
[28] Robust Online Learned Spatio-Temporal Context Model for Visual Tracking
Wen, Longyin
Cai, Zhaowei
Lei, Zhen
Yi, Dong
Li, Stan Z.
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2014, 23 (02) : 785 - 796
[29] Learning spatio-temporal context via hierarchical features for visual tracking
Cao, Yi
Ji, Hongbing
Zhang, Wenbo
Xue, Fei
SIGNAL PROCESSING-IMAGE COMMUNICATION, 2018, 66 : 50 - 65
[30] DASTSiam: Spatio-temporal fusion and discriminative enhancement for Siamese visual tracking
Huang, Yucheng
Firkat, Eksan
Zhang, Jinlai
Zhu, Lijuan
Zhu, Bin
Zhu, Jihong
Hamdulla, Askar
IET COMPUTER VISION, 2023, 17 (08) : 1017 - 1033