Learning a spatial-temporal symmetry network for video super-resolution

Cited by: 0
Authors
Wang, Xiaohang [1 ,2 ]
Liu, Mingliang [1 ,2 ]
Wei, Pengying [1 ,2 ]
Affiliations
[1] Heilongjiang Univ, Dept Automat, Harbin 150080, Heilongjiang, Peoples R China
[2] Heilongjiang Univ, Key Lab Informat Fus Estimat & Detect, Harbin 150080, Heilongjiang, Peoples R China
Keywords
Video super-resolution; Motion estimation; Spatial-temporal symmetry; Convolutional neural network; Convolution
DOI
10.1007/s10489-022-03603-3
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Video super-resolution (VSR) aims to estimate and restore high-resolution (HR) sequences from low-resolution (LR) input. In recent years, many learning-based VSR methods have been proposed that combine convolutional neural networks (CNNs) with motion compensation. Most mainstream approaches rely on optical flow or deformable convolution, both of which require accurate motion estimates for compensation. However, most previous methods fail to fully exploit the spatial-temporal symmetry information in the input sequences. Moreover, considerable computation is spent aligning each neighbouring frame to the reference frame separately. Furthermore, many methods reconstruct HR results at only a single scale, which limits the reconstruction accuracy of the network and its performance in complex scenes. In this study, we propose a spatial-temporal symmetry network (STSN) to address these deficiencies. STSN consists of four parts: prefusion, alignment, postfusion and reconstruction. First, a two-stage fusion strategy is applied to reduce the computational cost of the network: a ConvGRU in the prefusion module eliminates redundant features between neighbouring frames and fuses and condenses several neighbouring frames into two parts. To generate accurate offset maps, we present a spatial-temporal symmetry attention block (STSAB), which exploits spatial-temporal symmetry combined with spatial attention. In the reconstruction module, we propose an SR multiscale residual block (SR-MSRB) to enhance reconstruction performance. Extensive experiments on several datasets show that our method achieves better accuracy and efficiency than state-of-the-art methods in both quantitative and qualitative evaluations.
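The abstract describes a four-stage pipeline (prefusion with a ConvGRU, STSAB-guided alignment, postfusion, and multiscale reconstruction). The minimal PyTorch sketch below only illustrates how such a pipeline can be wired together; the names STSNSketch, ConvGRUCell, MultiScaleResBlock, ch and scale are hypothetical, the alignment step is a plain convolution placeholder rather than the paper's STSAB, and the residual block is a generic two-branch design rather than the exact SR-MSRB.

import torch
import torch.nn as nn


class ConvGRUCell(nn.Module):
    """Simplified convolutional GRU cell, standing in for the prefusion recurrence."""
    def __init__(self, ch):
        super().__init__()
        self.gates = nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)  # update and reset gates
        self.cand = nn.Conv2d(2 * ch, ch, 3, padding=1)        # candidate hidden state

    def forward(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], 1))).chunk(2, 1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_tilde


class MultiScaleResBlock(nn.Module):
    """Residual block with parallel 3x3 and 5x5 branches (a generic multiscale
    design, not the paper's exact SR-MSRB)."""
    def __init__(self, ch):
        super().__init__()
        self.b3 = nn.Conv2d(ch, ch, 3, padding=1)
        self.b5 = nn.Conv2d(ch, ch, 5, padding=2)
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x):
        y = torch.cat([torch.relu(self.b3(x)), torch.relu(self.b5(x))], 1)
        return x + self.fuse(y)


class STSNSketch(nn.Module):
    """Hypothetical four-stage VSR pipeline: prefusion -> alignment -> postfusion -> reconstruction."""
    def __init__(self, ch=64, scale=4):
        super().__init__()
        self.feat = nn.Conv2d(3, ch, 3, padding=1)          # per-frame feature extraction
        self.prefuse = ConvGRUCell(ch)                       # condense neighbouring frames recurrently
        self.align = nn.Conv2d(2 * ch, ch, 3, padding=1)     # placeholder for STSAB-guided alignment
        self.postfuse = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.recon = nn.Sequential(
            MultiScaleResBlock(ch),
            MultiScaleResBlock(ch),
            nn.Conv2d(ch, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),                          # rearrange channels into the HR grid
        )

    def forward(self, lr_seq):
        # lr_seq: (B, T, 3, H, W), reference frame at the centre index.
        b, t, c, h, w = lr_seq.shape
        feats = [self.feat(lr_seq[:, i]) for i in range(t)]
        ref = feats[t // 2]
        hidden = torch.zeros_like(ref)
        for f in feats:                                      # prefusion: fold all frames into one state
            hidden = self.prefuse(f, hidden)
        aligned = self.align(torch.cat([hidden, ref], 1))    # align fused neighbours to the reference
        fused = self.postfuse(torch.cat([aligned, ref], 1))  # postfusion with the reference features
        return self.recon(fused)                             # multiscale reconstruction + upsampling


if __name__ == "__main__":
    out = STSNSketch()(torch.randn(1, 5, 3, 32, 32))
    print(out.shape)  # torch.Size([1, 3, 128, 128])

With a 5-frame 32x32 LR clip and scale 4, the sketch outputs a single 3x128x128 frame; this only confirms the tensor plumbing of the four stages and makes no claim about the paper's accuracy or efficiency.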
Pages: 3530-3544
Number of pages: 15
Related papers
50 records in total
  • [1] Learning a spatial-temporal symmetry network for video super-resolution
    Wang, Xiaohang
    Liu, Mingliang
    Wei, Pengying
    [J]. Applied Intelligence, 2023, 53 : 3530 - 3544
  • [2] Conditional Neural Video Coding with Spatial-Temporal Super-Resolution
    Wang, Henan
    Pan, Xiaohan
    Feng, Runsen
    Guo, Zongyu
    Chen, Zhibo
    [J]. 2024 DATA COMPRESSION CONFERENCE, DCC, 2024, : 591 - 591
  • [3] CTVSR: Collaborative Spatial-Temporal Transformer for Video Super-Resolution
    Tang, Jun
    Lu, Chenyan
    Liu, Zhengxue
    Li, Jiale
    Dai, Hang
    Ding, Yong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (06) : 5018 - 5032
  • [4] Deformable Spatial-Temporal Attention for Lightweight Video Super-Resolution
    Xue, Tong
    Huang, Xinyi
    Li, Dengshi
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X, 2024, 14434 : 482 - 493
  • [5] Building an End-to-End Spatial-Temporal Convolutional Network for Video Super-Resolution
    Guo, Jun
    Chao, Hongyang
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4053 - 4060
  • [6] Video super-resolution based on spatial-temporal recurrent residual networks
    Yang, Wenhan
    Feng, Jiashi
    Xie, Guosen
    Liu, Jiaying
    Guo, Zongming
    Yan, Shuicheng
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2018, 168 : 79 - 92
  • [7] Nonlocal-guided enhanced interaction spatial-temporal network for compressed video super-resolution
    Cheng, Junxiong
    Xiong, Shuhua
    He, Xiaohai
    Ren, Chao
    Zhang, Tingrong
    Chen, Honggang
    [J]. Applied Intelligence, 2023, 53 : 24407 - 24421
  • [8] Nonlocal-guided enhanced interaction spatial-temporal network for compressed video super-resolution
    Cheng, Junxiong
    Xiong, Shuhua
    He, Xiaohai
    Ren, Chao
    Zhang, Tingrong
    Chen, Honggang
    [J]. APPLIED INTELLIGENCE, 2023, 53 (20) : 24407 - 24421
  • [9] CycMuNet+: Cycle-Projected Mutual Learning for Spatial-Temporal Video Super-Resolution
    Hu, Mengshun
    Jiang, Kui
    Wang, Zheng
    Bai, Xiang
    Hu, Ruimin
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (11) : 13376 - 13392
  • [10] Learning Spatial-Temporal Implicit Neural Representations for Event-Guided Video Super-Resolution
    Lu, Yunfan
    Wang, Zipeng
    Liu, Minjie
    Wang, Hongjian
    Wang, Lin
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 1557 - 1567