See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks

Cited by: 418
Authors
Lu, Xiankai [1]
Wang, Wenguan [1]
Ma, Chao [2]
Shen, Jianbing [1]
Shao, Ling [1]
Porikli, Fatih [3]
Affiliations
[1] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[2] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
[3] Australian Natl Univ, Canberra, ACT, Australia
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Funding
Australian Research Council
Keywords
DOI
10.1109/CVPR.2019.00374
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task from a holistic view. We emphasize the importance of inherent correlation among video frames and incorporate a global co-attention mechanism to improve further the state-of-the-art deep learning based solutions that primarily focus on learning discriminative foreground representations over appearance and motion in short-term temporal segments. The co-attention layers in our network provide efficient and competent stages for capturing global correlations and scene context by jointly computing and appending co-attention responses into a joint feature space. We train COSNet with pairs of video frames, which naturally augments training data and allows increased learning capacity. During the segmentation stage, the co-attention model encodes useful information by processing multiple reference frames together, which is leveraged to infer the frequently reappearing and salient foreground objects better. We propose a unified and end-to-end trainable framework where different co-attention variants can be derived for mining the rich context within videos. Our extensive experiments over three large benchmarks manifest that COSNet outperforms the current alternatives by a large margin.
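The abstract's idea of computing co-attention responses between a pair of frames and appending them to the original features can be illustrated with a small NumPy sketch. This is a generic co-attention layer written under stated assumptions, not the authors' implementation: the function name `co_attention`, the bilinear weight `weight`, the `(channels, locations)` feature layout, and the use of softmax normalization are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def co_attention(feat_a, feat_b, weight):
    """Hypothetical co-attention between two frames.

    feat_a: (C, N_a) flattened feature map of frame A.
    feat_b: (C, N_b) flattened feature map of frame B.
    weight: (C, C) bilinear weight (an assumption of this sketch).
    Returns the joint features (2C, N_a) and (2C, N_b).
    """
    # affinity[i, j] scores location i of frame A against location j of frame B.
    affinity = feat_a.T @ weight @ feat_b            # (N_a, N_b)

    # Each location in A attends over all locations in B (rows sum to 1).
    attn_a = softmax(affinity, axis=1)
    # Each location in B attends over all locations in A (columns sum to 1).
    attn_b = softmax(affinity, axis=0)

    # Attention-weighted summaries of the other frame.
    summary_a = feat_b @ attn_a.T                    # (C, N_a)
    summary_b = feat_a @ attn_b                      # (C, N_b)

    # Append the co-attention responses to the original features.
    joint_a = np.concatenate([summary_a, feat_a], axis=0)
    joint_b = np.concatenate([summary_b, feat_b], axis=0)
    return joint_a, joint_b
```

Because the affinity is computed between every pair of locations in the two frames, each output location carries context from the whole other frame, which is one way to read the abstract's "global correlations".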
Pages: 3618 - 3627
Page count: 10
Related papers
35 in total
  • [1] Zero-Shot Video Object Segmentation With Co-Attention Siamese Networks
    Lu, Xiankai
    Wang, Wenguan
    Shen, Jianbing
    Crandall, David
    Luo, Jiebo
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (04) : 2228 - 2242
  • [2] Co-attention CNNs for Unsupervised Object Co-segmentation
    Hsu, Kuang-Jui
    Lin, Yen-Yu
    Chuang, Yung-Yu
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018 : 748 - 756
  • [3] COMatchNet: Co-Attention Matching Network for Video Object Segmentation
    Huang, Lufei
    Sun, Fengming
    Yuan, Xia
    PATTERN RECOGNITION, ACPR 2021, PT I, 2022, 13188 : 271 - 284
  • [4] Co-attention Propagation Network for Zero-Shot Video Object Segmentation
    Pei, Gensheng
    Yao, Yazhou
    Shen, Fumin
    Huang, Dan
    Huang, Xingguo
    Shen, Heng-Tao
    arXiv, 2023
  • [5] Hierarchical Co-Attention Propagation Network for Zero-Shot Video Object Segmentation
    Pei, Gensheng
    Yao, Yazhou
    Shen, Fumin
    Huang, Dan
    Huang, Xingguo
    Shen, Heng-Tao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 2348 - 2359
  • [6] Fast Video Object Segmentation Based on Siamese Networks
    Fu L.-H.
    Zhao Y.
    Sun X.-W.
    Lu Z.-S.
    Wang D.
    Yang H.-X.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2020, 48 (04) : 625 - 630
  • [7] Learning Motion-Appearance Co-Attention for Zero-Shot Video Object Segmentation
    Yang, Shu
    Zhang, Lu
    Qi, Jinqing
    Lu, Huchuan
    Wang, Shuo
    Zhang, Xiaoxing
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 1544 - 1553
  • [8] Guided Slot Attention for Unsupervised Video Object Segmentation
    Lee, Minhyeok
    Cho, Suhwan
    Lee, Dogyoon
    Park, Chaewon
    Lee, Jungho
    Lee, Sangyoun
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024 : 3807 - 3816
  • [9] Asymmetric Attention Fusion for Unsupervised Video Object Segmentation
    Jiang, Hongfan
    Wu, Xiaojun
    Xu, Tianyang
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VI, 2024, 14430 : 170 - 182
  • [10] Joint Attention Mechanism for Unsupervised Video Object Segmentation
    Yao, Rui
    Xu, Xin
    Zhou, Yong
    Zhao, Jiaqi
    Fang, Liang
    PATTERN RECOGNITION AND COMPUTER VISION, PT I, 2021, 13019 : 154 - 165