An Efficient Multi-Scale Attention Feature Fusion Network for 4K Video Frame Interpolation

Cited by: 1

Authors:

Ning, Xin [1]

Li, Yuhang [1]

Feng, Ziwei [1]

Liu, Jinhua [1]

Ding, Youdong [1,2]

Affiliations:

[1] Shanghai Univ, Coll Shanghai Film, 788 Guangzhong Rd, Shanghai 200072, Peoples R China

[2] Shanghai Engn Res Ctr Mot Picture Special Effects, 788 Guangzhong Rd, Shanghai 200072, Peoples R China

Source:

ELECTRONICS | 2024, Vol. 13, No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

4K video frame interpolation; 4K video dataset; self-attention; multi-scale; high frame rate;

DOI:

10.3390/electronics13061037

Chinese Library Classification:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Video frame interpolation aims to generate intermediate frames in a video to showcase finer details. However, most methods are only trained and tested on low-resolution datasets, lacking research on 4K video frame interpolation problems. This limitation makes it challenging to handle high-frame-rate video processing in real-world scenarios. In this paper, we propose a 4K video dataset at 120 fps, named UHD4K120FPS, which contains large motion. We also propose a novel framework for solving the 4K video frame interpolation task, based on a multi-scale pyramid network structure. We introduce self-attention to capture long-range dependencies and self-similarities in pixel space, which overcomes the limitations of convolutional operations. To reduce computational cost, we use a simple mapping-based approach to lighten self-attention, while still allowing for content-aware aggregation weights. Through extensive quantitative and qualitative experiments, we demonstrate the excellent performance achieved by our proposed model on the UHD4K120FPS dataset, as well as illustrate the effectiveness of our method for 4K video frame interpolation. In addition, we evaluate the robustness of the model on low-resolution benchmark datasets.

Pages: 16

50 items in total

[1] A Fast 4K Video Frame Interpolation Using a Multi-Scale Optical Flow Reconstruction Network
Ahn, Ha-Eun
Jeong, Jinwoo
Kim, Je Woo
Kwon, Soonchul
Yoo, Jisang
SYMMETRY-BASEL, 2019, 11 (10)
[2] A Multi-Scale Position Feature Transform Network for Video Frame Interpolation
Cheng, Xianhang
Chen, Zhenzhong
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (11) : 3968 - 3981
[3] Multi-Scale Attention Generative Adversarial Networks for Video Frame Interpolation
Xiao, Jian
Bi, Xiaojun
IEEE ACCESS, 2020, 8 : 94842 - 94851
[4] EMCFN: Edge-based Multi-scale Cross Fusion Network for video frame interpolation
Wang, Shaowen
Yang, Xiaohui
Feng, Zhiquan
Sun, Jiande
Liu, Ju
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 103
[5] Multi-Scale Warping for Video Frame Interpolation
Choi, Whan
Koh, Yeong Jun
Kim, Chang-Su
IEEE ACCESS, 2021, 9 : 150470 - 150479
[6] MFANet: Multi-scale feature fusion network with attention mechanism
Wang, Gaihua
Gan, Xin
Cao, Qingcheng
Zhai, Qianyu
VISUAL COMPUTER, 2023, 39 (07) : 2969 - 2980
[7] MFANet: Multi-scale feature fusion network with attention mechanism
Gaihua Wang
Xin Gan
Qingcheng Cao
Qianyu Zhai
The Visual Computer, 2023, 39 : 2969 - 2980
[8] Pyramid attention object detection network with multi-scale feature fusion
Chen, Xiu
Li, Yujie
Nakatoh, Yoshihisa
COMPUTERS & ELECTRICAL ENGINEERING, 2022, 104
[9] Multi-scale feature fusion network with local attention for lung segmentation
Xie, Yinghua
Zhou, Yuntong
Wang, Chen
Ma, Yanshan
Yang, Ming
SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 119
[10] Multi-Scale Feature Fusion Network with Attention for Single Image Dehazing
Hu, Bin
PATTERN RECOGNITION AND IMAGE ANALYSIS, 2021, 31 (04) : 608 - 615