Attention-guided Temporally Coherent Video Object Matting

Cited by: 13
Authors
Zhang, Yunke [1 ]
Wang, Chi [1 ]
Cui, Miaomiao [2 ]
Ren, Peiran [2 ]
Xie, Xuansong [2 ]
Hua, Xian-Sheng [3 ]
Bao, Hujun [1 ]
Huang, Qixing [4 ]
Xu, Weiwei [1 ]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] Damo Acad, Alibaba Grp, Hangzhou, Peoples R China
[4] Univ Texas Austin, Austin, TX 78712 USA
Funding
National Key R&D Program of China;
Keywords
datasets; neural networks; video matting; attention mechanism; INTERACTIVE IMAGE; SEGMENTATION;
DOI
10.1145/3474085.3475623
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a novel deep learning-based video object matting method that achieves temporally coherent matting results. Its key component is an attention-based temporal aggregation module that maximizes the strength of image matting networks when applied to video matting. This module computes temporal correlations for pixels adjacent to each other along the time axis in feature space, which is robust against motion noise. We also design a novel loss term to train the attention weights, which drastically boosts video matting performance. In addition, we show how to effectively solve the trimap generation problem by fine-tuning a state-of-the-art video object segmentation network with a sparse set of user-annotated keyframes. To facilitate the training of the video matting and trimap generation networks, we construct a large-scale video matting dataset with 80 training and 28 validation foreground video clips with ground-truth alpha mattes. Experimental results show that our method can generate high-quality alpha mattes for various videos featuring appearance change, occlusion, and fast motion. Our code and dataset can be found at: https://github.com/yunkezhang/TCVOM
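The temporal aggregation idea described in the abstract can be illustrated with a minimal sketch, assuming PyTorch, a hypothetical TemporalAggregation module, and per-pixel attention restricted to the same spatial location in adjacent frames; it is a simplified illustration, not the authors' released implementation (see the repository linked above for that).

    # Minimal sketch of attention-based temporal feature aggregation, assuming
    # PyTorch and per-pixel attention over temporally adjacent frames. This is
    # NOT the released TCVOM code; see the GitHub repository linked above.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TemporalAggregation(nn.Module):  # hypothetical module name
        """Aggregate neighboring-frame features into the current frame using
        per-pixel attention weights computed in feature space."""
        def __init__(self, channels, key_dim=64):
            super().__init__()
            # 1x1 convolutions project features to query/key/value spaces
            self.to_q = nn.Conv2d(channels, key_dim, 1)
            self.to_k = nn.Conv2d(channels, key_dim, 1)
            self.to_v = nn.Conv2d(channels, channels, 1)

        def forward(self, feat_cur, feats_adj):
            # feat_cur:  (B, C, H, W) features of the current frame
            # feats_adj: list of (B, C, H, W) features of adjacent frames
            q = self.to_q(feat_cur)
            scores, values = [], []
            for f in feats_adj:
                k = self.to_k(f)
                v = self.to_v(f)
                # temporal correlation: query-key dot product at each pixel
                scores.append((q * k).sum(dim=1, keepdim=True))  # (B, 1, H, W)
                values.append(v)
            # softmax over the temporal axis yields per-pixel attention weights
            attn = F.softmax(torch.stack(scores, dim=0), dim=0)
            agg = sum(a * v for a, v in zip(attn, values))       # (B, C, H, W)
            # fuse aggregated temporal context with the current-frame features
            return feat_cur + agg

    # Usage: aggregate two neighbors into the middle frame of a 3-frame clip.
    # feats = [torch.randn(1, 256, 64, 64) for _ in range(3)]
    # out = TemporalAggregation(256)(feats[1], [feats[0], feats[2]])

Letting each pixel softmax over its temporal neighbors, rather than averaging frames directly, is what the abstract credits for robustness against motion noise: unreliable neighbors receive low attention weights.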
Pages: 5128-5137
Number of pages: 10
Related Papers
50 records in total
  • [31] Object-level change detection with a dual correlation attention-guided detector
    Zhang, Lin
    Hu, Xiangyun
    Zhang, Mi
    Shu, Zhen
    Zhou, Hao
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2021, 177 : 147 - 160
  • [32] Video Compression Artifacts Removal With Spatial-Temporal Attention-Guided Enhancement
    Jiang, Nanfeng
    Chen, Weiling
    Lin, Jielian
    Zhao, Tiesong
    Lin, Chia-Wen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 5657 - 5669
  • [33] Motion Guided Attention for Video Salient Object Detection
    Li, Haofeng
    Chen, Guanqi
    Li, Guanbin
    Yu, Yizhou
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 7273 - 7282
  • [34] Guided Slot Attention for Unsupervised Video Object Segmentation
    Lee, Minhyeok
    Cho, Suhwan
    Lee, Dogyoon
    Park, Chaewon
    Lee, Jungho
    Lee, Sangyoun
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024, : 3807 - 3816
  • [35] Attention-guided CNN for image denoising
    Tian, Chunwei
    Xu, Yong
    Li, Zuoyong
    Zuo, Wangmeng
    Fei, Lunke
    Liu, Hong
    NEURAL NETWORKS, 2020, 124 : 117 - 129
  • [36] Automatic video matting based on hybrid video object segmentation and closed-form matting
    Hu, Wu-Chih
    Hsu, Jung-Fu
    Huang, Deng-Yuan
    JOURNAL OF ELECTRONIC IMAGING, 2013, 22 (02)
  • [37] AG-YOLO: Attention-guided network for real-time object detection
    Zhu, Hangyu
    Sun, Libo
    Qin, Wenhu
    Tian, Feng
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (09) : 28197 - 28213
  • [39] Recurrent Network with Enhanced Alignment and Attention-Guided Aggregation for Compressed Video Quality Enhancement
    Shi, Xiaodi
    Lin, Jucai
    Jiang, Dong
    Nian, Chunmei
    Yin, Jun
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022,
  • [40] Temporally coherent person matting trained on fake-motion dataset
    Molodetskikh, Ivan
    Erofeev, Mikhail
    Moskalenko, Andrey
    Vatolin, Dmitry
    DIGITAL SIGNAL PROCESSING, 2022, 126