Multi-Stage Feature Fusion Network for Video Super-Resolution

Cited by: 38
Authors
Song, Huihui [1 ,2 ]
Xu, Wenjie [1 ,2 ]
Liu, Dong [3 ]
Liu, Bo [4 ]
Liu, Qingshan [1 ,2 ]
Metaxas, Dimitris N. [5 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Jiangsu Key Lab Big Data Anal Technol B DAT, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Jiangsu Collaborat Innovat Ctr Atmospher Environm, Nanjing 210044, Peoples R China
[3] Netflix Inc, Los Gatos, CA 95032 USA
[4] JD Finance Amer Corp, Mountain View, CA 94089 USA
[5] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08854 USA
Funding
National Natural Science Foundation of China;
Keywords
Visualization; Convolution; Superresolution; Task analysis; Fuses; Feature extraction; Modulation; Video super-resolution; single image super-resolution; deep learning; deformable convolution; feature fusion; QUALITY ASSESSMENT;
DOI
10.1109/TIP.2021.3056868
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video super-resolution (VSR) aims to restore a photo-realistic high-resolution (HR) frame from both its corresponding low-resolution (LR) frame (the reference frame) and multiple neighboring frames (the supporting frames). An important step in VSR is to fuse the feature of the reference frame with the features of the supporting frames. The major issue with existing VSR methods is that the fusion is conducted in a one-stage manner, so the fused feature may deviate greatly from the visual information in the original LR reference frame. In this paper, we propose an end-to-end Multi-Stage Feature Fusion Network that fuses the temporally aligned features of the supporting frames and the spatial feature of the original reference frame at different stages of a feed-forward neural network architecture. In our network, the Temporal Alignment Branch is designed as an inter-frame temporal alignment module that mitigates the misalignment between the supporting frames and the reference frame. Specifically, we apply multi-scale dilated deformable convolution as the basic operation to generate temporally aligned features of the supporting frames. Afterwards, the Modulative Feature Fusion Branch, the other branch of our network, accepts the temporally aligned feature map as a conditional input and modulates the feature of the reference frame at different stages of the branch backbone. This enables the feature of the reference frame to be referenced at each stage of the feature fusion process, leading to an enhanced feature from LR to HR. Experimental results on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art performance on the VSR task.
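The multi-stage modulation idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the form of the modulation, so the sketch assumes a simple affine (scale-and-shift) modulation predicted from the aligned feature by 1x1 convolutions, plus an assumed ReLU and residual connection per stage. The names `conv1x1`, `modulate`, `multi_stage_fusion`, and `init_stage` are hypothetical, and the temporal alignment step (deformable convolution) is omitted; the aligned feature is taken as given.

```python
import numpy as np

def conv1x1(x, weight, bias):
    """Pointwise (1x1) convolution. x: (C_in, H, W), weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def modulate(ref_feat, aligned_feat, params):
    """Assumed affine modulation: scale and shift are predicted from the
    aligned (conditional) feature and applied to the reference feature."""
    gamma = conv1x1(aligned_feat, params["w_gamma"], params["b_gamma"])
    beta = conv1x1(aligned_feat, params["w_beta"], params["b_beta"])
    return ref_feat * (1.0 + gamma) + beta

def multi_stage_fusion(ref_feat, aligned_feat, stage_params):
    """Re-inject the same conditional input at every stage, instead of
    fusing once at the start (the one-stage scheme the abstract criticizes)."""
    feat = ref_feat
    for params in stage_params:
        fused = modulate(feat, aligned_feat, params)
        feat = feat + np.maximum(fused, 0.0)  # ReLU + residual (assumption)
    return feat

def init_stage(rng, channels, scale=0.1):
    """Random parameters for one hypothetical fusion stage."""
    return {
        "w_gamma": scale * rng.standard_normal((channels, channels)),
        "b_gamma": np.zeros(channels),
        "w_beta": scale * rng.standard_normal((channels, channels)),
        "b_beta": np.zeros(channels),
    }

rng = np.random.default_rng(0)
ref = rng.standard_normal((4, 8, 8))      # reference-frame feature (C, H, W)
aligned = rng.standard_normal((4, 8, 8))  # temporally aligned supporting feature
stages = [init_stage(rng, 4) for _ in range(3)]
out = multi_stage_fusion(ref, aligned, stages)
print(out.shape)
```

The point of the sketch is only the control flow: the conditional feature enters every stage, so the running feature is repeatedly tied back to the same conditioning signal rather than fused once and then processed blindly.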
Pages: 2923 - 2934
Page count: 12
Related papers
50 in total
  • [1] MULTI-STAGE FEATURE ALIGNMENT NETWORK FOR VIDEO SUPER-RESOLUTION
    Suzuki, Keito
    Ikehara, Masaaki
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2001 - 2005
  • [2] RECURSIVE MULTI-STAGE UPSCALING NETWORK WITH DISCRIMINATIVE FUSION FOR SUPER-RESOLUTION
    Lu, Yue
    Jiang, Zhuqing
    Ju, Guodong
    Shen, Liangheng
    Men, Aidong
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 574 - 579
  • [3] A multi-stage spatio-temporal adaptive network for video super-resolution
    Zhang, Yuhang
    Chen, Zhenzhong
    Liu, Shan
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 87
  • [4] Deep Feature Fusion Network for Compressed Video Super-Resolution
    Yue Wang
    Xiaohong Wu
    Xiaohai He
    Chao Ren
    Tingrong Zhang
    [J]. Neural Processing Letters, 2022, 54 : 4427 - 4441
  • [5] Deep Feature Fusion Network for Compressed Video Super-Resolution
    Wang, Yue
    Wu, Xiaohong
    He, Xiaohai
    Ren, Chao
    Zhang, Tingrong
    [J]. NEURAL PROCESSING LETTERS, 2022, 54 (05) : 4427 - 4441
  • [6] Multi-branch-feature fusion super-resolution network
    Li, Dong
    Yang, Silu
    Wang, Xiaoming
    Qin, Yu
    Zhang, Heng
    [J]. DIGITAL SIGNAL PROCESSING, 2024, 145
  • [7] MultiBoot VSR: Multi-Stage Multi-Reference Bootstrapping for Video Super-Resolution
    Kalarot, Ratheesh
    Porikli, Fatih
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 2060 - 2069
  • [8] Multi-Stage Edge-Guided Stereo Feature Interaction Network for Stereoscopic Image Super-Resolution
    Wan, Jin
    Yin, Hui
    Liu, Zhihao
    Liu, Yanting
    Wang, Song
    [J]. IEEE TRANSACTIONS ON BROADCASTING, 2023, 69 (02) : 357 - 368
  • [9] Multi-stage frame alignment video super-resolution network
    Wang, Sen
    Zhu, Yang
    Zhang, Yinhui
    Wang, Qingjian
    He, Zifen
    [J]. Guangxue Jingmi Gongcheng/Optics and Precision Engineering, 2023, 31 (16) : 2430 - 2443
  • [10] MS2Net: Multi-Scale and Multi-Stage Feature Fusion for Blurred Image Super-Resolution
    Niu, Axi
    Zhu, Yu
    Zhang, Chaoning
    Sun, Jinqiu
    Wang, Pei
    Kweon, In So
    Zhang, Yanning
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (08) : 5137 - 5150