Coherence-aware context aggregator for fast video object segmentation

Cited by: 23
Authors
Lan, Meng [1]
Zhang, Jing [2]
Wang, Zengmao [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[2] Univ Sydney, Sch Comp Sci, Camperdown, Australia
Funding
National Natural Science Foundation of China
Keywords
Video object segmentation; Semi-supervised learning; Spatio-temporal representation; Context
DOI
10.1016/j.patcog.2022.109214
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised video object segmentation (VOS) is a highly challenging problem that has attracted much research attention in recent years. Temporal context plays an important role in VOS by providing object clues from past frames. However, most prevailing methods directly use the predicted temporal results to guide the segmentation of the current frame while ignoring the coherence of the temporal context, which may be misleading and degrade performance. In this paper, we propose a novel model named Coherence-aware Context Aggregator (CCA) for VOS, which consists of three modules. First, a coherence-aware module (CAM) evaluates the coherence of the predicted result of the current frame and then fuses the coherent features to update the temporal context. CAM can determine whether the prediction is accurate, thus guiding the update of the temporal context and avoiding the introduction of erroneous information. Second, we devise a spatio-temporal context aggregation (STCA) module that aggregates the temporal context with the spatial feature of the current frame to learn a robust and discriminative target representation in the decoder. Third, we design a refinement module to refine the coarse feature generated by the STCA module for more precise segmentation. Additionally, CCA uses a cropping strategy and takes small-size images as input, which makes it computationally efficient and lets it run at real-time speed. Extensive experiments on four challenging benchmarks show that CCA achieves a better trade-off between efficiency and accuracy than state-of-the-art methods. The code will be made public. © 2022 Elsevier Ltd. All rights reserved.
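The idea behind the coherence-aware update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `update_temporal_context`, the scalar `coherence` score in [0, 1], the `threshold` parameter, and the linear blending rule are all assumptions made for illustration. The abstract only states that CAM judges whether the current prediction is coherent and uses that judgment to guide the context update.

```python
from typing import List


def update_temporal_context(
    context: List[float],
    feature: List[float],
    coherence: float,
    threshold: float = 0.5,
) -> List[float]:
    """Blend the current-frame feature into the context only if it is coherent.

    Hypothetical gating rule, not taken from the paper: `coherence` is an
    assumed scalar score in [0, 1] and `threshold` is an assumed cutoff.
    """
    if not 0.0 <= coherence <= 1.0:
        raise ValueError("coherence must lie in [0, 1]")
    if len(context) != len(feature):
        raise ValueError("context and feature must have the same length")
    if coherence < threshold:
        # Prediction judged unreliable: keep the old context unchanged so
        # that erroneous information is not propagated to later frames.
        return list(context)
    # Coherent prediction: weight the new feature by its coherence score.
    return [(1.0 - coherence) * c + coherence * f for c, f in zip(context, feature)]


if __name__ == "__main__":
    old_context = [1.0, 0.0]
    new_feature = [0.0, 1.0]
    # Low coherence leaves the context untouched.
    print(update_temporal_context(old_context, new_feature, coherence=0.2))
    # High coherence moves the context toward the new feature.
    print(update_temporal_context(old_context, new_feature, coherence=1.0))
```

In a real model the context and features would be learned tensors and the coherence score would itself be predicted by a network; the sketch only shows the gating pattern of skipping the update when the prediction is judged unreliable.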
Pages: 12