Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking

Authors
Li, Yizhe [1, 2]
Zhou, Sanping [1, 2]
Qin, Zheng [1, 2]
Wang, Le [1, 2]
Wang, Jinjun [1, 2]
Zheng, Nanning [1, 2]
Affiliations
[1] Xi An Jiao Tong Univ, Natl Key Lab Human Machine Hybrid Augmented Intell, Natl Engn Res Ctr Visual Informat & Applicat, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian 710049, Peoples R China
Funding
China Postdoctoral Science Foundation; National Key Research and Development Program of China;
Keywords
Target tracking; Feature extraction; Tracking; Representation learning; Object detection; Visualization; Task analysis; Multi-object tracking; discriminative feature learning; data association;
DOI
10.1109/TMM.2024.3394683
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Multi-Object Tracking (MOT) remains a vital component of intelligent video analysis. It aims to locate targets and maintain a consistent identity for each target throughout a video sequence. Existing works usually learn a discriminative feature representation, such as motion and appearance, to associate detections across frames, but these features are easily affected by mutual occlusion and background clutter in practice. In this paper, we propose a simple yet effective two-stage feature learning paradigm that jointly learns single-shot and multi-shot features for different targets, so as to achieve robust data association in the tracking process. For detections that have not yet been associated, we design a novel single-shot feature learning module to extract discriminative features of each detection, which can efficiently associate targets between adjacent frames. For tracklets that have been lost for several frames, we design a novel multi-shot feature learning module to extract discriminative features of each tracklet, which can accurately recover these lost targets after a long period. Once equipped with a simple data association logic, the resulting VisualTracker can perform robust MOT based on the single-shot and multi-shot feature representations. Extensive experimental results demonstrate that our method achieves significant improvements on the MOT17 and MOT20 datasets while reaching state-of-the-art performance on the DanceTrack dataset.
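The two-stage association described in the abstract can be illustrated with a minimal sketch. This is not the authors' VisualTracker. The learned single-shot and multi-shot feature modules are replaced here by plain feature vectors. Cosine similarity, greedy matching, mean pooling over a tracklet's history, the thresholds, and all function names are assumptions chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_match(sim, threshold):
    """Pair ids one-to-one, best similarity first, stopping below the threshold."""
    pairs, used_tracks, used_dets = [], set(), set()
    for (track, det), score in sorted(sim.items(), key=lambda kv: -kv[1]):
        if score < threshold:
            break
        if track in used_tracks or det in used_dets:
            continue
        pairs.append((track, det))
        used_tracks.add(track)
        used_dets.add(det)
    return pairs


def mean_feature(history):
    """Assumed stand-in for a multi-shot feature: average over a tracklet's frames."""
    return [sum(column) / len(history) for column in zip(*history)]


def associate(active, lost, detections, t_single=0.7, t_multi=0.6):
    """Hypothetical two-stage association.

    active:     {track_id: feature from the previous frame}
    lost:       {track_id: [feature per past frame]}
    detections: {det_id: feature in the current frame}
    """
    # Stage 1: match active tracks to detections using per-frame features.
    sim1 = {(t, d): cosine(f, g)
            for t, f in active.items() for d, g in detections.items()}
    stage1 = greedy_match(sim1, t_single)
    matched = {d for _, d in stage1}

    # Stage 2: match lost tracklets to the leftover detections
    # using a feature pooled over each tracklet's history.
    leftover = {d: g for d, g in detections.items() if d not in matched}
    sim2 = {(t, d): cosine(mean_feature(h), g)
            for t, h in lost.items() for d, g in leftover.items()}
    stage2 = greedy_match(sim2, t_multi)
    matched |= {d for _, d in stage2}

    unmatched = [d for d in detections if d not in matched]
    return stage1, stage2, unmatched


# Toy data: one active track, one lost tracklet, three detections.
active = {1: [1.0, 0.0]}
lost = {2: [[0.0, 1.0], [0.1, 0.9]]}
detections = {"a": [0.9, 0.1], "b": [0.05, 1.0], "c": [-1.0, -1.0]}
stage1, stage2, unmatched = associate(active, lost, detections)
```

In this toy run, detection "a" is linked to the active track, "b" is recovered by the lost tracklet, and "c" is left over to start a new track. A real tracker would typically use an optimal assignment solver and learned embeddings instead of the greedy matching and averaging assumed here.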
Pages: 9515 - 9526
Page count: 12
Related papers
50 in total
  • [31] Feature Compression for Multimodal Multi-Object Tracking
    Li, Xinlin
    Hanna, Osama A.
    Fragouli, Christina
    Diggavi, Suhas
    Verma, Gunjan
    Bhattacharyya, Joydeep
    MILCOM 2023 - 2023 IEEE MILITARY COMMUNICATIONS CONFERENCE, 2023,
  • [32] Multi-shot process is a team effort
    Unknown
    MOLDING SYSTEMS, 1998, 56 (05): : 41 - 43
  • [33] A method of single-shot target detection with multi-scale feature fusion and feature enhancement
    Qu, Zhong
    Shang, Xue
    Xia, Shu-Fang
    Yi, Tu-Ming
    Zhou, Dong-Yang
    IET IMAGE PROCESSING, 2022, 16 (06) : 1752 - 1763
  • [34] CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again
    Hou, Haoxiong
    Shen, Chao
    Zhang, Ximing
    Gao, Wei
    SENSORS, 2023, 23 (07)
  • [35] A single-shot game of multi-period inspection
    Hohzaki, Ryusuke
    Maehara, Hiroki
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2010, 207 (03) : 1410 - 1418
  • [36] Coupled metric learning for single-shot versus single-shot person reidentification
    Li, Wei
    Wu, Yang
    Mukunoki, Masayuki
    Minoh, Michihiko
    OPTICAL ENGINEERING, 2013, 52 (02)
  • [37] Single-Shot Object Detection via Feature Enhancement and Channel Attention
    Li, Yi
    Wang, Lingna
    Wang, Zeji
    SENSORS, 2022, 22 (18)
  • [38] Deep multi-shot network for modelling appearance similarity in multi-person tracking applications
    Gomez-Silva, Maria J.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (15) : 23701 - 23721
  • [39] M2Det: A Single-Shot Object Detector Based on Multi-Level Feature Pyramid Network
    Zhao, Qijie
    Sheng, Tao
    Wang, Yongtao
    Tang, Zhi
    Chen, Ying
    Cai, Ling
    Ling, Haibin
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9259 - 9266
  • [40] Multi-shot Temporal Event Localization: a Benchmark
    Liu, Xiaolong
    Hu, Yao
    Bai, Song
    Ding, Fei
    Bai, Xiang
    Torr, Philip H. S.
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 12591 - 12601