STMT: Spatio-temporal memory transformer for multi-object tracking

Cited by: 0
Authors
Songbo Gu
Jianxin Ma
Guancheng Hui
Qiyang Xiao
Wentao Shi
Affiliations
[1] Henan University, School of Artificial Intelligence
[2] Northwestern Polytechnical University, School of Marine Science and Technology
Source
Applied Intelligence | 2023 / Vol. 53
Keywords
Deep learning; Multi-object tracking; Transformer; Memory; Spatio-temporal
DOI
Not available
Abstract
Typically, modern online Multi-Object Tracking (MOT) methods first obtain the detected objects in each frame and then establish associations between them in successive frames. However, it is difficult to obtain high-quality trajectories when camera motion, fast motion, and occlusion challenges occur. To address these problems, this paper proposes a transformer-based MOT system named Spatio-Temporal Memory Transformer (STMT), which focuses on time and history information. The proposed STMT consists of a Spatio-Temporal Enhancement Module (STEM) that uses 3D convolution to model the spatial and temporal interactions of objects and obtains rich features in spatio-temporal information. Moreover, a Dynamic Spatio-Temporal Memory (DSTM) is presented to associate detections with tracklets and contains three units: an Identity Aggregation Module (IAM), a Linear Dynamic Encoder (LD-Encoder) and a memory Decoder (Decoder). The IAM utilizes the geometric changes of objects to reduce the impact of deformation on tracking performance, the LD-Encoder is used to obtain the dependency between objects, and the Decoder generates appearance similarity scores. Furthermore, a Score Fusion Equilibrium Strategy (SFES) is employed to balance the similarity and position distance fusion scores. Extensive experiments demonstrate that the proposed STMT approach is generally superior to the state-of-the-art trackers on the MOT16 and MOT17 benchmarks.
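The abstract states that SFES balances appearance similarity against position distance, but it does not give the fusion formula. The sketch below is therefore only an illustration under stated assumptions, not the authors' method: it assumes a simple weighted sum, uses IoU as the position cue, and pairs tracklets with detections greedily. The names `fuse_scores`, `greedy_match`, `alpha`, and `min_score` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_scores(
    appearance: List[List[float]],
    track_boxes: List[Box],
    det_boxes: List[Box],
    alpha: float = 0.5,
) -> List[List[float]]:
    """Assumed fusion rule: alpha * appearance + (1 - alpha) * IoU.

    appearance[t][d] is the appearance similarity between tracklet t
    and detection d, expected in [0, 1].
    """
    return [
        [
            alpha * appearance[t][d] + (1.0 - alpha) * iou(tb, db)
            for d, db in enumerate(det_boxes)
        ]
        for t, tb in enumerate(track_boxes)
    ]


def greedy_match(
    scores: List[List[float]], min_score: float = 0.3
) -> List[Tuple[int, int]]:
    """Pair tracklets and detections in descending order of fused score."""
    candidates = sorted(
        ((s, t, d) for t, row in enumerate(scores) for d, s in enumerate(row)),
        reverse=True,
    )
    used_t, used_d, matches = set(), set(), []
    for s, t, d in candidates:
        if s < min_score:
            break  # all remaining candidates score lower
        if t not in used_t and d not in used_d:
            matches.append((t, d))
            used_t.add(t)
            used_d.add(d)
    return matches


# Usage: two tracklets, two detections listed in swapped order.
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(21, 21, 31, 31), (1, 1, 11, 11)]
appearance = [[0.1, 0.9], [0.8, 0.2]]
matches = greedy_match(fuse_scores(appearance, tracks, dets))
```

Greedy matching is used here only to keep the sketch dependency-free; trackers commonly use an optimal assignment solver instead, and the weight `alpha` would need tuning per dataset.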
Pages: 23426 - 23441
Page count: 15
Related papers
  • [1] STMT: Spatio-temporal memory transformer for multi-object tracking
    Gu, Songbo
    Ma, Jianxin
    Hui, Guancheng
    Xiao, Qiyang
    Shi, Wentao
    [J]. APPLIED INTELLIGENCE, 2023, 53 (20) : 23426 - 23441
  • [2] Learning Spatio-Temporal Information for Multi-Object Tracking
    Wei, Jian
    Yang, Mei
    Liu, Feng
    [J]. IEEE ACCESS, 2017, 5 : 3869 - 3877
  • [3] STAT: Multi-Object Tracking Based on Spatio-Temporal Topological Constraints
    Zhang, Junjie
    Wang, Mingyan
    Jiang, Haoran
    Zhang, Xinyu
    Yan, Chenggang
    Zeng, Dan
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 4445 - 4457
  • [4] Spatio-Temporal Correlation Graph for Association Enhancement in Multi-object Tracking
    Zhong, Zhijie
    Sheng, Hao
    Zhang, Yang
    Wu, Yubin
    Chen, Jiahui
    Ke, Wei
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT I, 2019, 11775 : 394 - 405
  • [5] Spatio-temporal object detection by deep learning: Video-interlacing to improve multi-object tracking
    Mhalla, Ala
    Chateau, Thierry
    Ben Amara, Najoua Essoukri
    [J]. IMAGE AND VISION COMPUTING, 2019, 88 : 120 - 131
  • [6] Spatio-temporal hierarchical feature transformer for UAV object tracking
    Zhu, Fuzhen
    Cui, Jingyi
    Dou, Kaiqi
    [J]. ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2023, 204 : 442 - 452
  • [7] Efficient Multi-object Detection for Complexity Spatio-Temporal Scenes
    Wang, Kai
    Song, Xiangyu
    Sun, Shijie
    Zhao, Juan
    Xu, Cai
    Song, Huansheng
    [J]. WEB AND BIG DATA, PT IV, APWEB-WAIM 2023, 2024, 14334 : 186 - 200
  • [8] ShaSTA: Modeling Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking
    Sadjadpour, Tara
    Li, Jie
    Ambrus, Rares
    Bohg, Jeannette
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (05) : 4273 - 4280
  • [9] Exploring reliable infrared object tracking with spatio-temporal fusion transformer
    Qi, Meibin
    Wang, Qinxin
    Zhuang, Shuo
    Zhang, Ke
    Li, Kunyuan
    Liu, Yimin
    Yang, Yanfang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [10] Learning Spatio-Temporal Transformer for Visual Tracking
    Yan, Bin
    Peng, Houwen
    Fu, Jianlong
    Wang, Dong
    Lu, Huchuan
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 10428 - 10437