Spatial constraint for efficient semi-supervised video object segmentation

Cited by: 1
Authors
Chen, Yadang [1, 2]
Ji, Chuanjun [1, 2]
Yang, Zhi-Xin [3, 4]
Wu, Enhua [5]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China
[3] Univ Macau, State Key Lab Internet Things Smart City, Macau 999078, Peoples R China
[4] Univ Macau, Dept Electromech Engn, Macau 999078, Peoples R China
[5] Univ Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video object segmentation; Memory-based methods; Redundant information; Semantically similar objects
DOI
10.1016/j.cviu.2023.103843
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised video object segmentation is the process of tracking and segmenting objects in a video sequence based on annotated masks for one or more frames. Recently, memory-based methods have attracted a significant amount of attention due to their strong performance. Having too much redundant information stored in memory, however, makes such methods inefficient and inaccurate. Moreover, a global matching strategy is usually used for memory reading, so these methods are susceptible to interference from semantically similar objects and are prone to incorrect segmentation. We propose a spatial constraint network to overcome these problems. In particular, we introduce a time-varying sensor and a dynamic feature memory to adaptively store pixel information to facilitate the modeling of the target object, which greatly reduces information redundancy in the memory without missing critical information. Furthermore, we propose an efficient memory reader that is less computationally intensive and has a smaller footprint. More importantly, we introduce a spatial constraint module to learn spatial consistency to obtain more precise segmentation; the target and distractors can be identified by the learned spatial response. The experimental results indicate that our method is competitive with state-of-the-art methods on several benchmark datasets. Our method also achieves an approximately 30 FPS inference speed, which is close to the requirement for real-time systems.
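The interference problem the abstract attributes to global matching can be illustrated with a toy example. The sketch below is not the authors' network: the function names (`read_memory`, `gaussian_prior`), the dot-product affinity, and the hand-written Gaussian prior are all assumptions made for illustration, whereas the paper describes a learned spatial constraint module. It only shows why weighting the memory read by spatial proximity can separate a target from a distractor whose features look identical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_prior(positions, center, sigma):
    """Hypothetical spatial prior: weight decays with distance from `center`."""
    return [
        math.exp(-((x - center[0]) ** 2 + (y - center[1]) ** 2) / (2 * sigma ** 2))
        for x, y in positions
    ]


def read_memory(query_key, memory_keys, memory_values, prior=None):
    """Attention-style memory read for one query pixel.

    With prior=None every memory pixel competes on feature similarity alone
    (global matching). With a prior, its log is added to the affinity so that
    spatially distant memory pixels are suppressed before the softmax.
    """
    scores = [sum(q * k for q, k in zip(query_key, key)) for key in memory_keys]
    if prior is not None:
        scores = [s + math.log(w + 1e-12) for s, w in zip(scores, prior)]
    attention = softmax(scores)
    dim = len(memory_values[0])
    return [
        sum(a * value[d] for a, value in zip(attention, memory_values))
        for d in range(dim)
    ]


# Two memory pixels with identical keys: a target and a look-alike distractor.
memory_keys = [[1.0, 0.0], [1.0, 0.0]]
memory_values = [[1.0], [0.0]]          # 1.0 = target mask, 0.0 = background
memory_positions = [(2, 2), (20, 20)]   # target is near the query, distractor far

query_key = [1.0, 0.0]
query_position = (3, 2)

global_read = read_memory(query_key, memory_keys, memory_values)
prior = gaussian_prior(memory_positions, query_position, sigma=3.0)
constrained_read = read_memory(query_key, memory_keys, memory_values, prior)

print("global matching:", global_read)
print("with spatial prior:", constrained_read)
```

With global matching the two identical keys split the attention evenly, so the read-out is ambiguous between target and distractor; with the prior, nearly all attention goes to the nearby target pixel.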
Pages: 10