Referring Multi-Object Tracking

Cited by: 16
Authors
Wu, Dongming [1]
Han, Wencheng [2]
Wang, Tiancai [3]
Dong, Xingping [4]
Zhang, Xiangyu [3, 5]
Shen, Jianbing [2]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] Univ Macau, SKL IOTSC, CIS, Macau, Peoples R China
[3] MEGVII Technol, Beijing, Peoples R China
[4] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[5] Beijing Acad Artificial Intelligence, Beijing, Peoples R China
DOI
10.1109/CVPR52729.2023.01406
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing referring understanding tasks tend to involve the detection of a single text-referred object. In this paper, we propose a new and general referring understanding task, termed referring multi-object tracking (RMOT). Its core idea is to employ a language expression as a semantic cue to guide the prediction of multi-object tracking. To the best of our knowledge, it is the first work to achieve an arbitrary number of referent object predictions in videos. To push forward RMOT, we construct one benchmark with scalable expressions based on KITTI, named Refer-KITTI. Specifically, it provides 18 videos with 818 expressions, and each expression in a video is annotated with an average of 10.7 objects. Further, we develop a transformer-based architecture TransRMOT to tackle the new task in an online manner, which achieves impressive detection performance and outperforms other counterparts. The Refer-KITTI dataset and the code are released at https://referringmot.github.io.
Pages: 14633-14642
Page count: 10
Related papers
  • [1] EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving
    Lin, Jiacheng
    Chen, Jiajun
    Peng, Kunyu
    He, Xuan
    Li, Zhiyong
    Stiefelhagen, Rainer
    Yang, Kailun
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024,
  • [2] ROMOT: Referring-expression-comprehension open-set multi-object tracking
    Li, Wei
    Li, Bowen
    Wang, Jingqi
    Meng, Weiliang
    Zhang, Jiguang
    Zhang, Xiaopeng
    [J]. VISUAL COMPUTER, 2024,
  • [3] Multi-object trajectory tracking
    Han, Mei
    Xu, Wei
    Tao, Hai
    Gong, Yihong
    [J]. MACHINE VISION AND APPLICATIONS, 2007, 18 (3-4) : 221 - 232
  • [4] Multi-object tracking in video
    Agbinya, JI
    Rees, D
    [J]. REAL-TIME IMAGING, 1999, 5 (05) : 295 - 304
  • [5] Multi-object trajectory tracking
    Mei Han
    Wei Xu
    Hai Tao
    Yihong Gong
    [J]. Machine Vision and Applications, 2007, 18 : 221 - 232
  • [6] MULTI-OBJECT TRACKING AS ATTENTION MECHANISM
    Fukui, Hiroshi
    Miyagawa, Taiki
    Morishita, Yusuke
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 505 - 509
  • [7] Multi-object tracking of human spermatozoa
    Sorensen, Lauge
    Ostergaard, Jakob
    Johansen, Peter
    de Bruijne, Marleen
    [J]. MEDICAL IMAGING 2008: IMAGE PROCESSING, PTS 1-3, 2008, 6914
  • [8] MOTS: Multi-Object Tracking and Segmentation
    Voigtlaender, Paul
    Krause, Michael
    Osep, Aljosa
    Luiten, Jonathon
    Sekar, Berin Balachandar Gnana
    Geiger, Andreas
    Leibe, Bastian
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 7934 - 7943
  • [9] Interacting Tracklets for Multi-Object Tracking
    Lan, Long
    Wang, Xinchao
    Zhang, Shiliang
    Tao, Dacheng
    Gao, Wen
    Huang, Thomas S.
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (09) : 4585 - 4597
  • [10] TrackFormer: Multi-Object Tracking with Transformers
    Meinhardt, Tim
    Kirillov, Alexander
    Leal-Taixe, Laura
    Feichtenhofer, Christoph
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 8834 - 8844