Reducing the Annotation Effort for Video Object Segmentation Datasets

Cited by: 1
Authors
Voigtlaender, Paul [1 ]
Luo, Lishu [2 ]
Yuan, Chun [2 ]
Jiang, Yong [2 ]
Leibe, Bastian [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Aachen, Germany
[2] Tsinghua Shenzhen Int Grad Sch, Shenzhen, Peoples R China
DOI
10.1109/WACV48630.2021.00310
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For further progress in video object segmentation (VOS), larger, more diverse, and more challenging datasets will be necessary. However, densely labeling every frame with pixel masks does not scale to large datasets. We use a deep convolutional network to automatically create pseudo-labels on a pixel level from much cheaper bounding box annotations and investigate how far such pseudo-labels can carry us for training state-of-the-art VOS approaches. A very encouraging result of our study is that adding a manually annotated mask in only a single video frame for each object is sufficient to generate pseudo-labels which can be used to train a VOS method to reach almost the same performance level as when training with fully segmented videos. We use this workflow to create pixel pseudo-labels for the training set of the challenging tracking dataset TAO, and we manually annotate a subset of the validation set. Together, we obtain the new TAO-VOS benchmark, which we make publicly available at www.vision.rwth-aachen.de/page/taovos. While the performance of state-of-the-art methods on existing datasets starts to saturate, TAO-VOS remains very challenging for current algorithms and reveals their shortcomings.
Pages: 3059-3068
Number of pages: 10