EFFICIENT CONVOLUTION AND TRANSFORMER-BASED NETWORK FOR VIDEO FRAME INTERPOLATION

Cited by: 1

Authors:

Khalifeh, Issa [1, 2]

Murn, Luka [1]

Mrak, Marta [2]

Izquierdo, Ebroul [2]

Affiliations:

[1] BBC Res & Dev, London, England

[2] Queen Mary Univ London, London, England

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2023

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

Video frame interpolation; transformer; complexity reduction; dual-encoder;

DOI:

10.1109/ICIP49359.2023.10222296

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video frame interpolation is an increasingly important research task with several key industrial applications in the video coding, broadcast and production sectors. Recently, transformers have been introduced to the field resulting in substantial performance gains. However, this comes at a cost of greatly increased memory usage, training and inference time. In this paper, a novel method integrating a transformer encoder and convolutional features is proposed. This network reduces the memory burden by close to 50% and runs up to four times faster during inference time compared to existing transformer-based interpolation methods. A dual-encoder architecture is introduced which combines the strength of convolutions in modelling local correlations with those of the transformer for long-range dependencies. Quantitative evaluations are conducted on various benchmarks with complex motion to showcase the robustness of the proposed method, achieving competitive performance compared to state-of-the-art interpolation networks.


Pages: 1050-1054

Number of pages: 5
