Reducing the Annotation Effort for Video Object Segmentation Datasets

Cited by: 1
Authors
Voigtlaender, Paul [1]
Luo, Lishu [2]
Yuan, Chun [2]
Jiang, Yong [2]
Leibe, Bastian [1]
Affiliations
[1] Rhein Westfal TH Aachen, Aachen, Germany
[2] Tsinghua Shenzhen Int Grad Sch, Shenzhen, Peoples R China
DOI
10.1109/WACV48630.2021.00310
Chinese Library Classification (CLC) number
TP18 [Theory of Artificial Intelligence];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For further progress in video object segmentation (VOS), larger, more diverse, and more challenging datasets will be necessary. However, densely labeling every frame with pixel masks does not scale to large datasets. We use a deep convolutional network to automatically create pseudo-labels on a pixel level from much cheaper bounding box annotations and investigate how far such pseudo-labels can carry us for training state-of-the-art VOS approaches. A very encouraging result of our study is that adding a manually annotated mask in only a single video frame for each object is sufficient to generate pseudo-labels which can be used to train a VOS method to reach almost the same performance level as when training with fully segmented videos. We use this workflow to create pixel pseudo-labels for the training set of the challenging tracking dataset TAO, and we manually annotate a subset of the validation set. Together, we obtain the new TAO-VOS benchmark, which we make publicly available at www.vision.rwth-aachen.de/page/taovos. While the performance of state-of-the-art methods on existing datasets starts to saturate, TAO-VOS remains very challenging for current algorithms and reveals their shortcomings.
Pages: 3059-3068
Page count: 10