See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks

Cited by: 418
Authors
Lu, Xiankai [1]
Wang, Wenguan [1]
Ma, Chao [2]
Shen, Jianbing [1]
Shao, Ling [1]
Porikli, Fatih [3]
Affiliations
[1] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[2] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
[3] Australian Natl Univ, Canberra, ACT, Australia
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Funding
Australian Research Council
DOI
10.1109/CVPR.2019.00374
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task from a holistic view. We emphasize the importance of inherent correlation among video frames and incorporate a global co-attention mechanism to improve further the state-of-the-art deep learning based solutions that primarily focus on learning discriminative foreground representations over appearance and motion in short-term temporal segments. The co-attention layers in our network provide efficient and competent stages for capturing global correlations and scene context by jointly computing and appending co-attention responses into a joint feature space. We train COSNet with pairs of video frames, which naturally augments training data and allows increased learning capacity. During the segmentation stage, the co-attention model encodes useful information by processing multiple reference frames together, which is leveraged to infer the frequently reappearing and salient foreground objects better. We propose a unified and end-to-end trainable framework where different co-attention variants can be derived for mining the rich context within videos. Our extensive experiments over three large benchmarks manifest that COSNet outperforms the current alternatives by a large margin.
Pages: 3618 - 3627
Number of pages: 10
Related papers
35 in total
  • [21] Adaptable neural networks for unsupervised video object segmentation of stereoscopic sequences
    Doulamis, AD
    Ntalianis, KS
    Doulamis, ND
    Kollias, SD
    ARTIFICIAL NEURAL NETWORKS-ICANN 2001, PROCEEDINGS, 2001, 2130 : 1060 - 1066
  • [22] Video object segmentation via attention-modulating networks
    Tang, Runfa
    Song, Huihui
    Zhang, Kaihua
    Jiang, Sihao
    ELECTRONICS LETTERS, 2019, 55 (08) : 455 - 456
  • [23] Video Object Segmentation Using Multi-Scale Attention-Based Siamese Network
    Zhu, Zhiliang
    Qiu, Leiningxin
    Wang, Jiaxin
    Xiong, Jinquan
    Peng, Hua
    ELECTRONICS, 2023, 12 (13)
  • [24] Saliency-based dual-attention network for unsupervised video object segmentation
    Zhang, Guifang
    Wong, Hon-Cheng
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (04) : 4996 - 5010
  • [26] Efficient Long-Short Temporal Attention network for unsupervised Video Object Segmentation
    Li, Ping
    Zhang, Yu
    Yuan, Li
    Xiao, Huaxin
    Lin, Binbin
    Xu, Xianghua
    PATTERN RECOGNITION, 2024, 146
  • [27] Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Embeddings
    Siam, Mennatullah
    Doraiswamy, Naren
    Oreshkin, Boris N.
    Yao, Hengshuai
    Jagersand, Martin
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020 : 860 - 867
  • [28] See More and Know More: Zero-shot Point Cloud Segmentation via Multi-modal Visual Data
    Lu, Yuhang
    Jiang, Qi
    Chen, Runnan
    Hou, Yuenan
    Zhu, Xinge
    Ma, Yuexin
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 21617 - 21627
  • [29] Dual-stream Co-enhanced Network for Unsupervised Video Object Segmentation
    Zhu, Hongliang
    Yin, Hui
    Liu, Yanting
    Chen, Ning
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2024, 18 (04) : 938 - 958
  • [30] Unsupervised Point Cloud Object Co-segmentation by Co-contrastive Learning and Mutual Attention Sampling
    Yang, Cheng-Kun
    Chuang, Yung-Yu
    Lin, Yen-Yu
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 7315 - 7324