Unsupervised Video Object Segmentation via Weak User Interaction and Temporal Modulation

Authors
FAN Jiaqing [1]
ZHANG Kaihua [2,3]
ZHAO Yaqian [4]
LIU Qingshan [2,3]
Affiliations
[1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
[2] College of Computer and Software, Nanjing University of Information Science and Technology
[3] Engineering Research Center of Digital Forensics, Ministry of Education
[4] Inspur Suzhou Intelligent Technology Corporation
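Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP391.41
Discipline classification code
080203
Abstract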
In unsupervised video object segmentation (UVOS), a model may segment the wrong target throughout the video because no initial prior information is available. In semi-supervised video object segmentation (SVOS), by contrast, a fine-grained pixel-level mask for the initial video frame is essential to good segmentation accuracy, but providing accurate pixel-level masks for every training sequence is expensive and laborious. To address this issue, we present a weak user-interactive UVOS approach guided by a simple human-made rectangle annotation in the initial frame. We first interactively mark the region of interest with a rectangle, and then we leverage the Mask R-CNN (region-based convolutional neural network) method to generate a set of coarse reference labels for subsequent mask propagation. To establish the temporal correspondence between coherent frames, we further design two novel temporal modulation modules to enhance the target representations. First, we compute the earth mover's distance (EMD)-based similarity between coherent frames to mine the co-occurrent objects in the two images, and use it to modulate the target representation so that the foreground target is highlighted. Second, we design a cross-squeeze temporal modulation module to emphasize the co-occurrent features across frames, which further enhances the foreground target representation. We augment the temporally modulated representations with the original representation to obtain composite spatio-temporal information, producing a more accurate video object segmentation (VOS) model. Experimental results on both UVOS and SVOS datasets, including DAVIS 2016, FBMS, YouTube-VOS, and DAVIS 2017, show that our method yields favorable accuracy and complexity. The related code is available.
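The EMD-based similarity mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each frame is summarized by a small set of local feature vectors with uniform weights, uses one minus cosine similarity as the transport cost, and swaps the exact EMD solver for an entropy-regularized Sinkhorn approximation so that the example needs only the standard library. The function names `cosine_cost`, `sinkhorn_plan`, and `emd_similarity` and the parameters `eps` and `iters` are hypothetical.

```python
import math


def cosine_cost(feats_a, feats_b):
    """Cost matrix with entries 1 - cosine similarity (assumed cost choice)."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [unit(v) for v in feats_a]
    b = [unit(v) for v in feats_b]
    return [[1.0 - sum(x * y for x, y in zip(u, w)) for w in b] for u in a]


def sinkhorn_plan(cost, eps=0.05, iters=200):
    """Approximate the optimal transport plan between two uniform
    distributions with Sinkhorn iterations (a stand-in for exact EMD)."""
    n, m = len(cost), len(cost[0])
    r, c = 1.0 / n, 1.0 / m  # uniform weights on both feature sets
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [r / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [c / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def emd_similarity(feats_a, feats_b, eps=0.05, iters=200):
    """Transport-weighted cosine similarity between two feature sets."""
    cost = cosine_cost(feats_a, feats_b)
    plan = sinkhorn_plan(cost, eps, iters)
    return sum(
        plan[i][j] * (1.0 - cost[i][j])
        for i in range(len(cost))
        for j in range(len(cost[0]))
    )


if __name__ == "__main__":
    frame_t = [[1.0, 0.0], [0.0, 1.0]]   # toy local features of frame t
    frame_t1 = [[1.0, 0.0], [0.0, 1.0]]  # toy local features of frame t+1
    print(emd_similarity(frame_t, frame_t1))
```

In this sketch, the transport plan assigns most of its mass to feature pairs that look alike in both frames, so the resulting score is high when the two frames share co-occurrent content. How such a score would then modulate the target representation is specific to the paper and is not reproduced here.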
Pages: 507-518
Page count: 12