TempFormer: Temporally Consistent Transformer for Video Denoising

Cited by: 8
Authors
Song, Mingyang [1, 2]
Zhang, Yang [2]
Aydin, Tunc O. [2 ]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] DisneyRes Studios, Zurich, Switzerland
Source
Keywords
Video denoising; Transformer; Temporal consistency
DOI
10.1007/978-3-031-19800-7_28
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video denoising is a low-level vision task that aims to restore high-quality videos from noisy content. Vision Transformer (ViT) is a new machine learning architecture that has shown promising performance on both high-level and low-level image tasks. In this paper, we propose a modified ViT architecture for video processing tasks, introducing a new training strategy and loss function to enhance temporal consistency without compromising spatial quality. Specifically, we propose an efficient hybrid Transformer-based model, TempFormer, which is composed of Spatio-Temporal Transformer Blocks (STTB) and 3D convolutional layers. The proposed STTB learns the temporal information between neighboring frames implicitly by using the proposed Joint Spatio-Temporal Mixer module for attention calculation and feature aggregation in each ViT block. Moreover, existing methods suffer from temporal inconsistency artifacts that are problematic in practical cases and distracting to viewers. We propose a sliding block strategy with a recurrent architecture, and use a new loss term, Overlap Loss, to alleviate the flickering between adjacent frames. Our method produces state-of-the-art spatio-temporal denoising quality with significantly improved temporal coherency, and requires fewer computational resources than competing methods to achieve comparable denoising quality (Fig. 1).
Pages: 481-496
Number of pages: 16
Related papers
50 in total
  • [1] DeepEnhancer: Temporally Consistent Focal Transformer for Comprehensive Video Enhancement
    Jiang, Qin
    Wang, Qinglin
    Chi, Lihua
    Ma, Wentao
    Li, Feng
    Liu, Jie
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 969 - 977
  • [2] Temporally Consistent Semantic Video Editing
    Xu, Yiran
    AlBahar, Badour
    Huang, Jia-Bin
    COMPUTER VISION - ECCV 2022, PT XV, 2022, 13675 : 357 - 374
  • [3] Complete and temporally consistent video outpainting
    Dehan, Loic
    Van Ranst, Wiebe
    Vandewalle, Patrick
    Goedeme, Toon
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 686 - 694
  • [4] A motional but temporally consistent physical video examples
    Du, Zhenyu
    Wei, Xingxing
    Zhang, Weiming
    Liu, Fangzheng
    Bian, Huanyu
    Liu, Jiayang
    JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2022, 69
  • [6] VIDEO STEREO MATCHING WITH TEMPORALLY CONSISTENT BELIEF PROPAGATION
    Hou, Hsin-Yu
    Wu, Sih-Sian
    Chang, Da-Fang
    Chen, Liang-Gee
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018,
  • [7] TEMPORALLY CONSISTENT VIDEO MATTING BASED ON BILAYER SEGMENTATION
    Tang, Zhen
    Miao, Zhenjiang
    Wan, Yanli
    2010 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2010), 2010, : 370 - 375
  • [8] Copy and Paste: Temporally Consistent Stereoscopic Video Blending
    Wang, Zongji
    Chen, Xiaowu
    Zou, Dongqing
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (10) : 3053 - 3065
  • [9] TCOVIS: Temporally Consistent Online Video Instance Segmentation
    Li, Junlong
    Yu, Bingyao
    Rao, Yongming
    Zhou, Jie
    Lu, Jiwen
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 1097 - 1107
  • [10] Temporally Efficient Vision Transformer for Video Instance Segmentation
    Yang, Shusheng
    Wang, Xinggang
    Li, Yu
    Fang, Yuxin
    Fang, Jiemin
    Liu, Wenyu
    Zhao, Xun
    Shan, Ying
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 2875 - 2885