An Efficient Multi-Scale Attention Feature Fusion Network for 4K Video Frame Interpolation

Cited by: 1
Authors
Ning, Xin [1]
Li, Yuhang [1]
Feng, Ziwei [1]
Liu, Jinhua [1]
Ding, Youdong [1,2]
Affiliations
[1] Shanghai Univ, Coll Shanghai Film, 788 Guangzhong Rd, Shanghai 200072, Peoples R China
[2] Shanghai Engn Res Ctr Mot Picture Special Effects, 788 Guangzhong Rd, Shanghai 200072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
4K video frame interpolation; 4K video dataset; self-attention; multi-scale; high frame rate;
DOI
10.3390/electronics13061037
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Video frame interpolation aims to generate intermediate frames in a video to showcase finer details. However, most methods are only trained and tested on low-resolution datasets, lacking research on 4K video frame interpolation problems. This limitation makes it challenging to handle high-frame-rate video processing in real-world scenarios. In this paper, we propose a 4K video dataset at 120 fps, named UHD4K120FPS, which contains large motion. We also propose a novel framework for solving the 4K video frame interpolation task, based on a multi-scale pyramid network structure. We introduce self-attention to capture long-range dependencies and self-similarities in pixel space, which overcomes the limitations of convolutional operations. To reduce computational cost, we use a simple mapping-based approach to lighten self-attention, while still allowing for content-aware aggregation weights. Through extensive quantitative and qualitative experiments, we demonstrate the excellent performance achieved by our proposed model on the UHD4K120FPS dataset, as well as illustrate the effectiveness of our method for 4K video frame interpolation. In addition, we evaluate the robustness of the model on low-resolution benchmark datasets.
Pages: 16
Related papers
50 in total
  • [1] A Fast 4K Video Frame Interpolation Using a Multi-Scale Optical Flow Reconstruction Network
    Ahn, Ha-Eun
    Jeong, Jinwoo
    Kim, Je Woo
    Kwon, Soonchul
    Yoo, Jisang
    SYMMETRY-BASEL, 2019, 11 (10):
  • [2] A Multi-Scale Position Feature Transform Network for Video Frame Interpolation
    Cheng, Xianhang
    Chen, Zhenzhong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (11) : 3968 - 3981
  • [3] Multi-Scale Attention Generative Adversarial Networks for Video Frame Interpolation
    Xiao, Jian
    Bi, Xiaojun
    IEEE ACCESS, 2020, 8 : 94842 - 94851
  • [4] EMCFN: Edge-based Multi-scale Cross Fusion Network for video frame interpolation
    Wang, Shaowen
    Yang, Xiaohui
    Feng, Zhiquan
    Sun, Jiande
    Liu, Ju
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 103
  • [5] Multi-Scale Warping for Video Frame Interpolation
    Choi, Whan
    Koh, Yeong Jun
    Kim, Chang-Su
    IEEE ACCESS, 2021, 9 : 150470 - 150479
  • [6] MFANet: Multi-scale feature fusion network with attention mechanism
    Wang, Gaihua
    Gan, Xin
    Cao, Qingcheng
    Zhai, Qianyu
    VISUAL COMPUTER, 2023, 39 (07) : 2969 - 2980
  • [7] MFANet: Multi-scale feature fusion network with attention mechanism
    Gaihua Wang
    Xin Gan
    Qingcheng Cao
    Qianyu Zhai
    The Visual Computer, 2023, 39 : 2969 - 2980
  • [8] Pyramid attention object detection network with multi-scale feature fusion
    Chen, Xiu
    Li, Yujie
    Nakatoh, Yoshihisa
    COMPUTERS & ELECTRICAL ENGINEERING, 2022, 104
  • [9] Multi-scale feature fusion network with local attention for lung segmentation
    Xie, Yinghua
    Zhou, Yuntong
    Wang, Chen
    Ma, Yanshan
    Yang, Ming
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 119
  • [10] Multi-Scale Feature Fusion Network with Attention for Single Image Dehazing
    Hu, Bin
    PATTERN RECOGNITION AND IMAGE ANALYSIS, 2021, 31 (04) : 608 - 615