Video Object Segmentation without Temporal Information

Cited by: 212
Authors
Maninis, Kevis-Kokitsi [1]
Caelles, Sergi [1]
Chen, Yuhua [1]
Pont-Tuset, Jordi [1]
Leal-Taixe, Laura [2]
Cremers, Daniel [2]
Van Gool, Luc [1]
Affiliations
[1] ETHZ, CH-8092 Zurich, Switzerland
[2] TUM, D-80333 Munich, Germany
Funding
EU Horizon 2020
Keywords
Video object segmentation; convolutional neural networks; semantic segmentation; instance segmentation;
DOI
10.1109/TPAMI.2018.2838670
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video Object Segmentation, and video processing in general, has been historically dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames. When the temporal smoothness is suddenly broken, such as when an object is occluded, or some frames are missing in a sequence, the result of these methods can deteriorate significantly. This paper explores the orthogonal approach of processing each frame independently, i.e., disregarding the temporal information. In particular, it tackles the task of semi-supervised video object segmentation: the separation of an object from the background in a video, given its mask in the first frame. We present Semantic One-Shot Video Object Segmentation (OSVOS-S), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one shot). We show that instance-level semantic information, when combined effectively, can dramatically improve the results of our previous method, OSVOS. We perform experiments on two recent single-object video segmentation databases, which show that OSVOS-S is both the fastest and most accurate method in the state of the art. Experiments on multi-object video segmentation show that OSVOS-S obtains competitive results.
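The per-frame, one-shot idea in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' network: the fully-convolutional model is replaced here by a hypothetical nearest-mean intensity classifier, and the function names (`fit_appearance`, `segment_frame`, `segment_video`) and the toy frames are assumptions made purely for illustration. What it keeps from the abstract is the structure: an appearance model is fitted once on the annotated first frame, and every other frame is then segmented on its own, with no information passed between frames.

```python
from statistics import mean


def fit_appearance(frame, mask):
    """Fit a toy appearance model from the one annotated frame.

    Stand-in for one-shot fine-tuning: returns the mean foreground and
    mean background intensity. Assumes the mask contains both classes.
    """
    fg = [p for row, mrow in zip(frame, mask) for p, m in zip(row, mrow) if m]
    bg = [p for row, mrow in zip(frame, mask) for p, m in zip(row, mrow) if not m]
    return mean(fg), mean(bg)


def segment_frame(frame, model):
    """Segment a single frame using only the fitted model (no neighbours)."""
    fg_mean, bg_mean = model
    return [[abs(p - fg_mean) < abs(p - bg_mean) for p in row] for row in frame]


def segment_video(frames, first_mask):
    """Fit on the first frame, then process every frame independently."""
    model = fit_appearance(frames[0], first_mask)
    return [segment_frame(f, model) for f in frames]


# Toy 2x2 grayscale "video": one bright object moving over a dark background.
frames = [
    [[0.9, 0.1], [0.1, 0.1]],
    [[0.1, 0.8], [0.1, 0.2]],
    [[0.2, 0.1], [0.1, 0.9]],
]
first_mask = [[True, False], [False, False]]
masks = segment_video(frames, first_mask)
```

Because each call to `segment_frame` depends only on the frame itself and the first-frame model, dropping or reordering intermediate frames leaves the remaining masks unchanged, which is the robustness to broken temporal smoothness that the abstract argues for.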
Pages: 1515-1530
Number of pages: 16