Unsupervised Video Object Segmentation via Weak User Interaction and Temporal Modulation

Cited by: 0
Authors
FAN Jiaqing [1]
ZHANG Kaihua [2,3]
ZHAO Yaqian [4]
LIU Qingshan [2,3]
Affiliations
[1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
[2] College of Computer and Software, Nanjing University of Information Science and Technology
[3] Engineering Research Center of Digital Forensics, Ministry of Education
[4] Inspur Suzhou Intelligent Technology Corporation
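Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP391.41
Discipline classification code
080203
Abstract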
In unsupervised video object segmentation (UVOS), the wrong target may be segmented throughout the video because initial prior information is lacking. In semi-supervised video object segmentation (SVOS), by contrast, a fine-grained pixel-level mask for the initial video frame is essential to good segmentation accuracy, but providing accurate pixel-level masks for each training sequence is expensive and laborious. To address these issues, we present a weak user-interactive UVOS approach guided by a simple human-made rectangle annotation in the initial frame. We first interactively mark the region of interest with a rectangle, and then leverage the Mask R-CNN (region-based convolutional neural network) method to generate a set of coarse reference labels for subsequent mask propagation. To establish temporal correspondence between the coherent frames, we further design two novel temporal modulation modules that enhance the target representation. First, we compute the earth mover's distance (EMD)-based similarity between coherent frames to mine the co-occurrent objects in the two images, and use it to modulate the target representation so that the foreground target is highlighted. Second, we design a cross-squeeze temporal modulation module to emphasize the co-occurrent features across frames, which further enhances the foreground target representation. We then augment the temporally modulated representations with the original representation to obtain composite spatio-temporal information, producing a more accurate video object segmentation (VOS) model. Experimental results on both UVOS and SVOS datasets, including DAVIS 2016, FBMS, YouTube-VOS, and DAVIS 2017, show that our method yields favorable accuracy and complexity. The related code is available.
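As a rough illustration of the EMD-based modulation idea described in the abstract, the sketch below matches local feature vectors from two frames with an entropy-regularised (Sinkhorn) approximation of the earth mover's distance and uses the resulting transport plan to reweight the first frame's features. This is not the authors' implementation: the function names `sinkhorn_plan` and `emd_modulate`, the uniform location weights, the cosine-based cost, the Sinkhorn approximation itself, and the residual form `feature * (1 + weight)` are all assumptions made for this example. Feature vectors are assumed to be non-zero.

```python
import math


def cosine(x, y):
    """Cosine similarity between two non-zero vectors (plain lists)."""
    dot = sum(p * q for p, q in zip(x, y))
    nx = math.sqrt(sum(p * p for p in x))
    ny = math.sqrt(sum(q * q for q in y))
    return dot / (nx * ny)


def sinkhorn_plan(cost, a, b, eps=0.05, n_iters=200):
    """Approximate the optimal transport plan between marginals a and b.

    Entropy-regularised stand-in for exact EMD (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale columns and rows to match the marginals.
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def emd_modulate(feat_a, feat_b, eps=0.05):
    """Reweight frame-A features by how well they match frame-B features.

    feat_a, feat_b: lists of local feature vectors (hypothetical layout).
    Returns (frame-level similarity, per-location weights, modulated features).
    """
    n, m = len(feat_a), len(feat_b)
    sim = [[cosine(x, y) for y in feat_b] for x in feat_a]
    cost = [[1.0 - s for s in row] for row in sim]
    a = [1.0 / n] * n  # uniform mass per location (assumption)
    b = [1.0 / m] * m
    plan = sinkhorn_plan(cost, a, b, eps)
    # Frame-level similarity: transport-weighted average of cosine similarity.
    score = sum(plan[i][j] * sim[i][j] for i in range(n) for j in range(m))
    # Per-location weight: matched similarity of each frame-A location,
    # clipped at zero so poorly matched locations are not amplified.
    weight = [
        max(0.0, sum(plan[i][j] * sim[i][j] for j in range(m)) / a[i])
        for i in range(n)
    ]
    # Residual augmentation: modulated features added to the originals.
    modulated = [[x * (1.0 + w) for x in row] for row, w in zip(feat_a, weight)]
    return score, weight, modulated
```

Locations in the first frame that find a close counterpart in the second frame receive weights near 1 and are roughly doubled by the residual step, while locations with no good match are left almost unchanged, which is one simple way to emphasise co-occurrent content.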
Pages: 507-518
Number of pages: 12
Related papers
50 in total
  • [41] Multi-Attention Network for Unsupervised Video Object Segmentation
    Zhang, Guifang
    Wong, Hon-Cheng
    Lo, Sio-Long
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 71 - 75
  • [42] Learning Motion Guidance for Efficient Unsupervised Video Object Segmentation
    Zhao Z.-C.
    Zhang K.-H.
    Fan J.-Q.
    Liu Q.-S.
    Zidonghua Xuebao/Acta Automatica Sinica, 2023, 49 (04): : 872 - 880
  • [43] Learning Unsupervised Video Object Segmentation through Visual Attention
    Wang, Wenguan
    Song, Hongmei
    Zhao, Shuyang
    Shen, Jianbing
    Zhao, Sanyuan
    Hoi, Steven C. H.
    Ling, Haibin
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3059 - 3069
  • [44] A neural network based scheme for unsupervised video object segmentation
    Doulamis, AD
    Doulamis, ND
    Kollias, SD
    1998 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING - PROCEEDINGS, VOL 2, 1998, : 632 - 636
  • [45] Flow-Edge Guided Unsupervised Video Object Segmentation
    Zhou, Yifeng
    Xu, Xing
    Shen, Fumin
    Zhu, Xiaofeng
    Shen, Heng Tao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (12) : 8116 - 8127
  • [46] Temporally Efficient Gabor Transformer for Unsupervised Video Object Segmentation
    Fan, Jiaqing
    Su, Tiankang
    Zhang, Kaihua
    Liu, Bo
    Liu, Qingshan
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 3394 - 3402
  • [47] Unsupervised Online Video Object Segmentation With Motion Property Understanding
    Zhuo, Tao
    Cheng, Zhiyong
    Zhang, Peng
    Wong, Yongkang
    Kankanhalli, Mohan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 237 - 249
  • [48] Unsupervised video object segmentation using conditional random fields
    Bhatti, Asma Hamza
    Rahman, Anis Ur
    Butt, Asad Anwar
    SIGNAL IMAGE AND VIDEO PROCESSING, 2019, 13 (01) : 9 - 16
  • [49] Unsupervised video object segmentation: an affinity and edge learning approach
    Muthu, Sundaram
    Tennakoon, Ruwan
    Hoseinnezhad, Reza
    Bab-Hadiashar, Alireza
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2022, 13 (11) : 3589 - 3605
  • [50] Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation
    Pei, Gensheng
    Shen, Fumin
    Yao, Yazhou
    Xie, Guo-Sen
    Tang, Zhenmin
    Tang, Jinhui
    COMPUTER VISION, ECCV 2022, PT XXXIV, 2022, 13694 : 596 - 613