Attention-guided Temporally Coherent Video Object Matting

Cited by: 13
Authors
Zhang, Yunke [1]
Wang, Chi [1]
Cui, Miaomiao [2]
Ren, Peiran [2]
Xie, Xuansong [2]
Hua, Xian-Sheng [3]
Bao, Hujun [1]
Huang, Qixing [4]
Xu, Weiwei [1]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] Damo Acad, Alibaba Grp, Hangzhou, Peoples R China
[4] Univ Texas Austin, Austin, TX 78712 USA
Funding
National Key R&D Program of China
Keywords
datasets; neural networks; video matting; attention mechanism; INTERACTIVE IMAGE; SEGMENTATION;
DOI
10.1145/3474085.3475623
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a novel deep learning-based video object matting method that can achieve temporally coherent matting results. Its key component is an attention-based temporal aggregation module that maximizes image matting networks' strength for video matting networks. This module computes temporal correlations for pixels adjacent to each other along the time axis in feature space, which is robust against motion noises. We also design a novel loss term to train the attention weights, which drastically boosts the video matting performance. Besides, we show how to effectively solve the trimap generation problem by fine-tuning a state-of-the-art video object segmentation network with a sparse set of user-annotated keyframes. To facilitate video matting and trimap generation networks' training, we construct a large-scale video matting dataset with 80 training and 28 validation foreground video clips with ground-truth alpha mattes. Experimental results show that our method can generate high-quality alpha mattes for various videos featuring appearance change, occlusion, and fast motion. Our code and dataset can be found at: https://github.com/yunkezhang/TCVOM
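The temporal aggregation idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the TCVOM repository for that); it only shows one plausible way to correlate each pixel's feature with spatially nearby pixels in the previous and next frames and to blend those neighbours with softmax weights. The function name `temporal_attention_aggregate`, the `radius` parameter, the dot-product similarity, and the `sqrt(C)` scaling are all assumptions made for illustration.

```python
import numpy as np

def temporal_attention_aggregate(feats, t, radius=1):
    """Hypothetical sketch of attention-based temporal aggregation.

    feats:  array of shape (T, C, H, W), per-frame feature maps.
    t:      index of the frame being matted.
    radius: half-size of the spatial search window in neighbouring frames.
    Returns (aggregated, weights): aggregated has shape (C, H, W);
    weights has shape (K, H, W), where K is the number of candidates per pixel.
    """
    T, C, H, W = feats.shape
    query = feats[t]
    neighbours = [n for n in (t - 1, t + 1) if 0 <= n < T]
    if not neighbours:
        return query.copy(), None

    scores, values = [], []
    for n in neighbours:
        # Edge-pad so that every pixel has a full window of candidates.
        padded = np.pad(feats[n], ((0, 0), (radius, radius), (radius, radius)),
                        mode="edge")
        for dy in range(2 * radius + 1):
            for dx in range(2 * radius + 1):
                shifted = padded[:, dy:dy + H, dx:dx + W]
                # Assumed similarity: scaled dot product in feature space.
                scores.append((query * shifted).sum(axis=0) / np.sqrt(C))
                values.append(shifted)

    scores = np.stack(scores)            # (K, H, W)
    values = np.stack(values)            # (K, C, H, W)
    # Numerically stable softmax over the K candidates of each pixel.
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    aggregated = (weights[:, None] * values).sum(axis=0)
    return aggregated, weights

rng = np.random.default_rng(0)
demo_feats = rng.standard_normal((3, 4, 5, 6))
demo_agg, demo_weights = temporal_attention_aggregate(demo_feats, t=1)
```

In a full network the aggregated features would typically be fused with the target frame's own features before decoding the alpha matte, and the attention weights would be learned with the help of the loss term the abstract mentions; neither step is shown here.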
Pages: 5128-5137
Page count: 10