Joint Video Object Discovery and Segmentation by Coupled Dynamic Markov Networks

Cited by: 13
Authors
Liu, Ziyi [1]
Wang, Le [1]
Hua, Gang [2]
Zhang, Qilin [3]
Niu, Zhenxing [4]
Wu, Ying [5]
Zheng, Nanning [1]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian 710049, Shaanxi, Peoples R China
[2] Microsoft Res, Redmond, WA 98052 USA
[3] HERE Technol, Chicago, IL 60606 USA
[4] Alibaba Grp, Hangzhou 311121, Zhejiang, Peoples R China
[5] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China; U.S. National Science Foundation;
Keywords
Object segmentation; object discovery; dynamic Markov networks; probabilistic graphical model; CO-SEGMENTATION; OPTICAL-FLOW; RECOGNITION; HISTOGRAMS;
DOI
10.1109/TIP.2018.2859622
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
It is challenging to extract the segmentation mask of a target from a single noisy video, a task that involves object discovery coupled with segmentation. To address this challenge, we present a method to jointly discover and segment an object from a noisy video, where the target disappears intermittently throughout the video. Previous methods either perform only video object discovery, or perform video object segmentation while presuming that the object exists in every frame. We argue that conducting the two tasks jointly in a unified way is beneficial; in other words, video object discovery and video object segmentation can facilitate each other. To validate this hypothesis, we propose a principled probabilistic model in which two dynamic Markov networks are coupled, one for discovery and the other for segmentation. When Bayesian inference is conducted on this model using belief propagation, the bi-directional message passing reveals a clear collaboration between these two inference tasks. We validated our proposed method on five data sets. The first three video data sets, i.e., the SegTrack data set, the YouTube-objects data set, and the DAVIS data set, are not noisy: all video frames contain the objects. The two noisy data sets, i.e., the XJTU-Stevens data set, and the Noisy-ViDiSeg data set, newly introduced in this paper, both have many frames that do not contain the objects. Compared with the state of the art, our method produces inferior results on video data sets without noisy frames but obtains better results on video data sets with noisy frames.
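The coupling described in the abstract can be illustrated with a toy sketch. This is not the authors' model: it assumes two binary chains (a hypothetical per-frame "object present" variable for discovery and a hypothetical per-frame "foreground" variable for segmentation), hand-picked potentials, and plain loopy sum-product belief propagation with NumPy. The function name `coupled_bp` and all parameters are illustrative assumptions.

```python
import numpy as np

def normalize(v):
    """Scale the last axis so it sums to 1."""
    return v / v.sum(axis=-1, keepdims=True)

def coupled_bp(unary_d, unary_s, temporal, coupling, iters=30):
    """Loopy sum-product BP on two coupled binary chains (toy sketch).

    unary_d, unary_s: (T, 2) local evidence for the discovery and
        segmentation chains (hypothetical inputs).
    temporal: (2, 2) compatibility between consecutive frames,
        shared by both chains, indexed [state_t, state_t_plus_1].
    coupling: (2, 2) compatibility indexed [discovery, segmentation].
    Returns approximate per-frame beliefs for each chain, each (T, 2).
    """
    unary = np.stack([unary_d, unary_s]).astype(float)  # (2, T, 2)
    temporal = np.asarray(temporal, dtype=float)
    coupling = np.asarray(coupling, dtype=float)
    T = unary.shape[1]
    fwd = np.ones_like(unary)    # message from frame t-1 into frame t
    bwd = np.ones_like(unary)    # message from frame t+1 into frame t
    cross = np.ones_like(unary)  # message from the other chain at frame t
    # cross_mat[c] maps the other chain's state onto chain c's state.
    cross_mat = [coupling, coupling.T]
    for _ in range(iters):
        new_fwd = np.ones_like(unary)
        new_bwd = np.ones_like(unary)
        new_cross = np.ones_like(unary)
        for c in (0, 1):
            o = 1 - c
            for t in range(T):
                if t > 0:
                    h = unary[c, t - 1] * fwd[c, t - 1] * cross[c, t - 1]
                    new_fwd[c, t] = normalize(temporal.T @ h)
                if t < T - 1:
                    h = unary[c, t + 1] * bwd[c, t + 1] * cross[c, t + 1]
                    new_bwd[c, t] = normalize(temporal @ h)
                # Bi-directional coupling: each chain informs the other.
                h = unary[o, t] * fwd[o, t] * bwd[o, t]
                new_cross[c, t] = normalize(cross_mat[c] @ h)
        fwd, bwd, cross = new_fwd, new_bwd, new_cross
    beliefs = normalize(unary * fwd * bwd * cross)
    return beliefs[0], beliefs[1]

# Usage: discovery evidence is uninformative; segmentation evidence says
# the object is visible in frames 0-1 and absent in frames 2-3.
unary_d = np.full((4, 2), 0.5)
unary_s = np.array([[0.1, 0.9], [0.1, 0.9], [0.9, 0.1], [0.9, 0.1]])
temporal = np.array([[0.7, 0.3], [0.3, 0.7]])
coupling = np.array([[0.9, 0.1], [0.1, 0.9]])
belief_d, belief_s = coupled_bp(unary_d, unary_s, temporal, coupling)
```

In this toy setting the discovery chain has no evidence of its own, so any preference it shows for "present" or "absent" in a frame comes entirely from messages passed across the coupling edges. Because the coupled graph contains loops, the resulting beliefs are approximate.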
Pages: 5840-5853
Number of pages: 14