STD-Net: Spatio-Temporal Decomposition Network for Video Demoiring With Sparse Transformers

Authors
Niu, Yuzhen [1, 2]
Xu, Rui [1, 2]
Lin, Zhihua [3]
Liu, Wenxi [1, 2]
Affiliations
[1] Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informa, Fuzhou 350108, Peoples R China
[2] Minist Educ, Engn Res Ctr Bigdata Intelligence, Fuzhou 350108, Peoples R China
[3] Res Inst Alipay Informat Technol Co Ltd, Hangzhou 310000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image restoration; video demoireing; video restoration; spatio-temporal network; sparse transformer; quality assessment; image
DOI
10.1109/TCSVT.2024.3386604
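Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract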
The problem of video demoireing is a new challenge in video restoration. Unlike image demoireing, which involves removing static and uniform patterns, video demoireing requires tackling dynamic and varied moire patterns while maintaining video details, colors, and temporal consistency. It is particularly challenging to model moire patterns for videos with camera or object motions, where separating moire from the original video content across frames is extremely difficult. Nonetheless, we observe that the spatial distribution of moire patterns is often sparse on each frame, and their long-range temporal correlation is not significant. To fully leverage this phenomenon, a sparsity-constrained spatial self-attention scheme is proposed to concentrate on removing sparse moire efficiently for each frame without being distracted by dynamic video content. The frame-wise spatial features are then correlated and aggregated via the local temporal cross-frame-attention module to produce temporally consistent, high-quality, moire-free videos. The above decoupled spatial and temporal transformers constitute the Spatio-Temporal Decomposition Network, dubbed STD-Net. For evaluation, we present a large-scale video demoireing benchmark featuring various real-life scenes, camera motions, and object motions. We demonstrate that our proposed model can effectively and efficiently achieve superior performance on video demoireing and single image demoireing tasks. The proposed dataset is released at https://github.com/FZU-N/LVDM.
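The abstract does not say how the sparsity constraint or the local temporal window is implemented, so the following is only an illustrative sketch of the two general ideas, not the STD-Net method. It assumes a top-k rule for sparse attention (each query attends only to its k highest-scoring keys) and a fixed-radius neighborhood for selecting nearby frames. The names `topk_sparse_attention`, `local_window`, `k`, and `radius` are hypothetical.

```python
import math


def topk_sparse_attention(queries, keys, values, k):
    """Scaled dot-product attention where each query keeps only its k best keys.

    Assumption: top-k selection stands in for the paper's unspecified
    sparsity constraint. Inputs are lists of equal-length float vectors.
    """
    dim = len(queries[0])
    outputs = []
    for q in queries:
        # Scaled dot-product score of this query against every key.
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(dim) for key in keys]
        # Indices of the k largest scores; every other key gets zero weight.
        kept = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        # Numerically stable softmax restricted to the kept keys.
        peak = max(scores[i] for i in kept)
        weights = {i: math.exp(scores[i] - peak) for i in kept}
        total = sum(weights.values())
        outputs.append([
            sum(weights[i] / total * values[i][d] for i in kept)
            for d in range(len(values[0]))
        ])
    return outputs


def local_window(t, num_frames, radius):
    """Indices of the frames within `radius` of frame t, clipped to the video.

    Assumption: a symmetric fixed-radius window stands in for the paper's
    unspecified local temporal neighborhood.
    """
    return list(range(max(0, t - radius), min(num_frames, t + radius + 1)))


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    vals = [[1.0], [2.0], [3.0]]
    print(topk_sparse_attention(q, keys, vals, k=1))
    print(local_window(2, num_frames=5, radius=1))
```

With `k` equal to the number of keys, the first function reduces to ordinary dense attention. Smaller `k` restricts each query to a few positions, which is one common way to exploit the kind of spatial sparsity the abstract describes.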
Pages: 8562-8575
Page count: 14