STMT: Spatio-temporal memory transformer for multi-object tracking

Cited by: 2
Authors
Gu, Songbo [1]
Ma, Jianxin [1]
Hui, Guancheng [1]
Xiao, Qiyang [1]
Shi, Wentao [2]
Affiliations
[1] Henan Univ, Sch Artificial Intelligence, Zhengzhou 450001, Henan, Peoples R China
[2] Northwestern Polytech Univ, Sch Marine Sci & Technol, Xian 710072, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; Multi-object tracking; Transformer; Memory; Spatio-temporal;
DOI
10.1007/s10489-023-04617-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Typically, modern online Multi-Object Tracking (MOT) methods first obtain the detected objects in each frame and then establish associations between them in successive frames. However, it is difficult to obtain high-quality trajectories when camera motion, fast motion, and occlusion challenges occur. To address these problems, this paper proposes a transformer-based MOT system named Spatio-Temporal Memory Transformer (STMT), which focuses on time and history information. The proposed STMT consists of a Spatio-Temporal Enhancement Module (STEM) that uses 3D convolution to model the spatial and temporal interactions of objects and obtains rich features in spatio-temporal information. Moreover, a Dynamic Spatio-Temporal Memory (DSTM) is presented to associate detections with tracklets and contains three units: an Identity Aggregation Module (IAM), a Linear Dynamic Encoder (LD-Encoder) and a memory Decoder (Decoder). The IAM utilizes the geometric changes of objects to reduce the impact of deformation on tracking performance, the LD-Encoder is used to obtain the dependency between objects, and the Decoder generates appearance similarity scores. Furthermore, a Score Fusion Equilibrium Strategy (SFES) is employed to balance the similarity and position distance fusion scores. Extensive experiments demonstrate that the proposed STMT approach is generally superior to the state-of-the-art trackers on the MOT16 and MOT17 benchmarks.
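The abstract describes fusing an appearance similarity score with a position distance score before associating detections with tracklets. The sketch below illustrates that general idea only and is not the paper's SFES, whose exact formulation is not given here. It assumes a plain weighted sum of appearance similarity and box IoU, followed by greedy matching; the names `iou`, `fuse_scores`, `greedy_match`, the weight `alpha`, and the threshold `min_score` are all hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_scores(appearance, track_boxes, det_boxes, alpha=0.5):
    """Weighted sum of appearance similarity and IoU.

    This fusion rule is an assumption made for illustration,
    not the strategy defined in the paper.
    """
    return [
        [alpha * appearance[t][d] + (1 - alpha) * iou(tb, db)
         for d, db in enumerate(det_boxes)]
        for t, tb in enumerate(track_boxes)
    ]


def greedy_match(scores, min_score=0.3):
    """Pair tracklets with detections, taking the best fused score first."""
    candidates = sorted(
        ((s, t, d) for t, row in enumerate(scores) for d, s in enumerate(row)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for s, t, d in candidates:
        if s < min_score:
            break  # all remaining candidates score even lower
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)
    return sorted(matches)


# Two tracklets and two detections; detections arrive in swapped order.
track_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
det_boxes = [(21, 21, 31, 31), (1, 1, 11, 11)]
appearance = [[0.1, 0.9], [0.8, 0.2]]  # rows: tracklets, columns: detections

matches = greedy_match(fuse_scores(appearance, track_boxes, det_boxes))
print(matches)
```

Each pair in `matches` is `(tracklet_index, detection_index)`. Greedy matching keeps the sketch dependency-free; trackers often use an optimal assignment solver such as the Hungarian algorithm at this step instead.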
Pages: 23426-23441
Page count: 16
Related papers
50 in total
  • [21] Spatio-Temporal Multi-Task Learning Transformer for Joint Moving Object Detection and Segmentation
    Mohamed, Eslam
    El Sallab, Ahmad
    [J]. 2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 1470 - 1475
  • [22] ViT Spatio-Temporal Feature Fusion for Aerial Object Tracking
    Guo, Chuangye
    Liu, Kang
    Deng, Donghu
    Li, Xuelong
    [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34 (08) : 6749 - 6761
  • [23] UAV Visual Object Tracking Based on Spatio-Temporal Context
    He, Yongxiang
    Chao, Chuang
    Zhang, Zhao
    Guo, Hongwu
    Ma, Jianjun
    [J]. Drones, 2024, 8 (12)
  • [24] Unified spatio-temporal attention mixformer for visual object tracking
    Park, Minho
    Yoon, Gang-Joon
    Song, Jinjoo
    Yoon, Sang Min
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 134
  • [25] SiamST: Siamese network with spatio-temporal awareness for object tracking
    Zhang, Hong
    Xing, Wanli
    Yang, Yifan
    Li, Yan
    Yuan, Ding
    [J]. INFORMATION SCIENCES, 2023, 634 : 122 - 139
  • [26] MULTIPLE OBJECT TRACKING BY HIERARCHICAL ASSOCIATION OF SPATIO-TEMPORAL DATA
    Beleznai, Csaba
    Schreiber, David
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, 2010, : 41 - 44
  • [27] Asymmetric Deformable Spatio-temporal Framework for Infrared Object Tracking
    Wu, Jingjing
    Zhou, Xi
    Li, Xiaohong
    Liu, Hao
    Qi, Meibin
    Hong, Richang
    [J]. ACM Transactions on Multimedia Computing, Communications and Applications, 2024, 20 (10)
  • [28] Spatio-Temporal Discriminative Correlation Filter Based Object Tracking
    Xu, Zheng
    Zhu, Songhao
    Sun, Peng
    Guo, Wenbo
    [J]. PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 5284 - 5288
  • [29] FFTransMOT: Feature-Fused Transformer for Enhanced Multi-Object Tracking
    Hu, Xufeng
    Jeon, Younghoon
    Gwak, Jeonghwan
    [J]. IEEE ACCESS, 2023, 11 : 130060 - 130071
  • [30] Spatial-Temporal Relation Networks for Multi-Object Tracking
    Xu, Jiarui
    Cao, Yue
    Zhang, Zheng
    Hu, Han
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 3987 - 3997