Video super-resolution via mixed spatial-temporal convolution and selective fusion

Cited: 11
Authors
Sun, Wei [1 ]
Gong, Dong [2 ]
Shi, Javen Qinfeng [3 ]
van den Hengel, Anton [3 ]
Zhang, Yanning [4 ]
Affiliations
[1] Xian Univ Posts & Telecommun, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW, Australia
[3] Univ Adelaide, Australian Inst Machine Learning, Adelaide, SA, Australia
[4] Northwestern Polytech Univ, Sch Comp Sci & Engn, Xian, Peoples R China
Keywords
Video super-resolution; Mixed spatial-temporal convolution; Selective feature fusion; Network
DOI
10.1016/j.patcog.2022.108577
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video super-resolution aims to recover high-resolution (HR) content from low-resolution (LR) observations by compositing the spatial-temporal information in the LR frames. It is crucial to model the spatial-temporal information jointly, since video sequences are three-dimensional spatial-temporal signals. Compared with explicitly estimating motion between 2D frames, 3D convolutional neural networks (CNNs) have shown their efficiency and effectiveness for video super-resolution (SR), as a natural way of modelling spatial-temporal data. Though promising, the performance of 3D CNNs is still far from satisfactory. Their high computational and memory requirements limit the development of more advanced designs that extract and fuse information from larger spatial and temporal scales. We thus propose a Mixed Spatial-Temporal Convolution (MSTC) block that simultaneously extracts the spatial information and the supplementary temporal dependency among frames by jointly applying 2D and 3D convolution. To further fuse the learned features corresponding to different frames, we propose a novel similarity-based selective feature fusion strategy, unlike previous methods that directly stack the learned features. Additionally, an attention-based motion compensation module is applied to alleviate the influence of misalignment between frames. Experiments on three widely used benchmark datasets and a real-world dataset show that, relying on its superior feature extraction and fusion ability, the proposed network outperforms previous state-of-the-art methods, especially in recovering confusing details. (c) 2022 Elsevier Ltd. All rights reserved.
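For concreteness, the sketch below illustrates the two ideas the abstract describes: mixing 2D (spatial) and 3D (temporal) convolution in one block, and fusing neighbouring-frame features by similarity to the centre frame instead of directly stacking them. This is a minimal PyTorch sketch, not the authors' implementation; the names MixedSpatialTemporalConv and similarity_select, the (batch, channels, frames, height, width) tensor layout, the summation of the two branches, and the cosine-similarity weighting are all assumptions made for illustration.

```python
# Illustrative sketch only (assumed names/layout), not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixedSpatialTemporalConv(nn.Module):
    """Extract per-frame spatial features with a 2D conv and supplement them
    with temporal context from a 3D conv over the frame dimension."""

    def __init__(self, channels: int):
        super().__init__()
        # 2D branch: applied to each frame independently (spatial information).
        self.conv2d = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # 3D branch: small spatio-temporal kernel mixes neighbouring frames.
        self.conv3d = nn.Conv3d(channels, channels, kernel_size=(3, 3, 3), padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, height, width)
        b, c, t, h, w = x.shape
        # Fold frames into the batch so the 2D conv processes every frame.
        spatial = self.conv2d(x.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w))
        spatial = spatial.reshape(b, t, c, h, w).permute(0, 2, 1, 3, 4)
        temporal = self.conv3d(x)
        # Summing the branches is one plausible way to "mix" them (assumption).
        return F.relu(spatial + temporal)


def similarity_select(center: torch.Tensor, neighbours: torch.Tensor) -> torch.Tensor:
    """Weight neighbouring-frame features by cosine similarity to the centre
    frame rather than stacking them directly (rough stand-in for selective fusion)."""
    # center: (b, c, h, w); neighbours: (b, n, c, h, w)
    sims = F.cosine_similarity(
        neighbours, center.unsqueeze(1).expand_as(neighbours), dim=2
    )                                                    # (b, n, h, w)
    weights = torch.softmax(sims, dim=1).unsqueeze(2)    # (b, n, 1, h, w)
    return (weights * neighbours).sum(dim=1)             # (b, c, h, w)


if __name__ == "__main__":
    feats = torch.randn(1, 16, 5, 32, 32)            # 5 LR frames of 16-channel features
    mixed = MixedSpatialTemporalConv(16)(feats)       # (1, 16, 5, 32, 32)
    fused = similarity_select(mixed[:, :, 2], mixed.permute(0, 2, 1, 3, 4))
    print(mixed.shape, fused.shape)
```

The paper's actual selective fusion and attention-based motion compensation module are more elaborate; the cosine-similarity weighting above only stands in for the general idea of selecting, rather than stacking, neighbouring-frame features.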
Pages: 14
相关论文
共 50 条
  • [1] Conditional Neural Video Coding with Spatial-Temporal Super-Resolution
    Wang, Henan
    Pan, Xiaohan
    Feng, Runsen
    Guo, Zongyu
    Chen, Zhibo
    [J]. 2024 DATA COMPRESSION CONFERENCE, DCC, 2024, : 591 - 591
  • [2] Learning a spatial-temporal symmetry network for video super-resolution
    Xiaohang Wang
    Mingliang Liu
    Pengying Wei
    [J]. Applied Intelligence, 2023, 53 : 3530 - 3544
  • [3] CTVSR: Collaborative Spatial-Temporal Transformer for Video Super-Resolution
    Tang, Jun
    Lu, Chenyan
    Liu, Zhengxue
    Li, Jiale
    Dai, Hang
    Ding, Yong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (06) : 5018 - 5032
  • [4] Deformable Spatial-Temporal Attention for Lightweight Video Super-Resolution
    Xue, Tong
    Huang, Xinyi
    Li, Dengshi
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X, 2024, 14434 : 482 - 493
  • [5] Learning a spatial-temporal symmetry network for video super-resolution
    Wang, Xiaohang
    Liu, Mingliang
    Wei, Pengying
    [J]. APPLIED INTELLIGENCE, 2023, 53 (03) : 3530 - 3544
  • [6] Video super-resolution based on spatial-temporal recurrent residual networks
    Yang, Wenhan
    Feng, Jiashi
    Xie, Guosen
    Liu, Jiaying
    Guo, Zongming
    Yan, Shuicheng
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2018, 168 : 79 - 92
  • [7] Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video Super-Resolution via Cycle-Projected Mutual Learning
    Hu, Mengshun
    Jiang, Kui
    Liao, Liang
    Xiao, Jing
    Jiang, Junjun
    Wang, Zheng
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 3564 - 3573
  • [8] Fine-grained video super-resolution via spatial-temporal learning and image detail enhancement
    Yeh, Chia -Hung
    Yang, Hsin-Fu
    Lin, Yu -Yang
    Huang, Wan-Jen
    Tsai, Feng-Hsu
    Kang, Li - Wei
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 131
  • [9] Fine-grained video super-resolution via spatial-temporal learning and image detail enhancement
    Yeh, Chia-Hung
    Yang, Hsin-Fu
    Lin, Yu-Yang
    Huang, Wan-Jen
    Tsai, Feng-Hsu
    Kang, Li-Wei
    [J]. Engineering Applications of Artificial Intelligence, 2024, 131
  • [10] Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting
    Li, Gen
    Ji, Jie
    Qin, Minghai
    Niu, Wei
    Ren, Bin
    Afghah, Fatemeh
    Guo, Linke
    Ma, Xiaolong
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10259 - 10269