Coherence-aware context aggregator for fast video object segmentation

Cited by: 23

Authors:

Lan, Meng [1]

Zhang, Jing [2]

Wang, Zengmao [1]

Affiliations:

[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China

[2] Univ Sydney, Sch Comp Sci, Camperdown, Australia

Source:

PATTERN RECOGNITION | 2023 / Vol. 136

Funding:

National Natural Science Foundation of China;

Keywords:

Video object segmentation; Semi-supervised learning; Spatio-temporal representation; Context;

DOI:

10.1016/j.patcog.2022.109214

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Semi-supervised video object segmentation (VOS) is a highly challenging problem that has attracted much research attention in recent years. Temporal context plays an important role in VOS by providing object clues from past frames. However, most prevailing methods directly use the predicted temporal results to guide the segmentation of the current frame while ignoring the coherence of the temporal context, which may be misleading and degrade performance. In this paper, we propose a novel model named Coherence-aware Context Aggregator (CCA) for VOS, which consists of three modules. First, a coherence-aware module (CAM) evaluates the coherence of the predicted result of the current frame and then fuses the coherent features to update the temporal context. CAM can determine whether the prediction is accurate, thus guiding the update of the temporal context and avoiding the introduction of erroneous information. Second, we devise a spatio-temporal context aggregation (STCA) module that aggregates the temporal context with the spatial feature of the current frame to learn a robust and discriminative target representation in the decoder part. Third, we design a refinement module to refine the coarse feature generated by the STCA module for more precise segmentation. Additionally, CCA uses a cropping strategy and takes small-size images as input, making it computationally efficient and allowing it to run in real time. Extensive experiments on four challenging benchmarks show that CCA achieves a better trade-off between efficiency and accuracy than state-of-the-art methods. The code will be made public. (c) 2022 Elsevier Ltd. All rights reserved.
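The idea of updating the temporal context only when the current prediction is coherent can be illustrated with a minimal sketch. This is not the authors' CAM: the abstract does not specify how coherence is scored or how features are fused. The sketch assumes, purely for illustration, that features are flat vectors, that coherence is approximated by cosine similarity between the stored context and the current feature, and that fusion is a moving average. The names `cosine_similarity`, `update_context`, `threshold`, and `momentum` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def update_context(context, feature, threshold=0.5, momentum=0.9):
    """Gate the temporal-context update on a coherence score.

    Assumption: cosine similarity stands in for the learned coherence
    estimate described in the abstract. If the score is below `threshold`,
    the context is returned unchanged so that a likely erroneous prediction
    is not absorbed; otherwise the feature is blended in with a moving average.
    """
    score = cosine_similarity(context, feature)
    if score < threshold:
        return list(context), score
    updated = [momentum * c + (1.0 - momentum) * f
               for c, f in zip(context, feature)]
    return updated, score
```

With `context = [1.0, 0.0]`, a feature pointing in the same direction is blended in, while an orthogonal feature such as `[0.0, 1.0]` scores 0 and leaves the context untouched.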

Pages: 12
