EFFICIENT CONVOLUTION AND TRANSFORMER-BASED NETWORK FOR VIDEO FRAME INTERPOLATION

Cited by: 1
Authors
Khalifeh, Issa [1, 2]
Murn, Luka [1]
Mrak, Marta [2]
Izquierdo, Ebroul [2]
Affiliations
[1] BBC Research & Development, London, England
[2] Queen Mary University of London, London, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Video frame interpolation; transformer; complexity reduction; dual-encoder
DOI
10.1109/ICIP49359.2023.10222296
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video frame interpolation is an increasingly important research task with several key industrial applications in the video coding, broadcast and production sectors. Recently, transformers have been introduced to the field, resulting in substantial performance gains. However, these gains come at the cost of greatly increased memory usage and longer training and inference times. In this paper, a novel method integrating a transformer encoder and convolutional features is proposed. This network reduces the memory burden by close to 50% and runs up to four times faster at inference time than existing transformer-based interpolation methods. A dual-encoder architecture is introduced which combines the strength of convolutions in modelling local correlations with that of the transformer in modelling long-range dependencies. Quantitative evaluations are conducted on various benchmarks with complex motion to showcase the robustness of the proposed method, which achieves competitive performance compared to state-of-the-art interpolation networks.
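The dual-encoder idea in the abstract can be illustrated with a toy sketch. This is not the authors' network: it assumes a 1D sequence of scalar tokens, a fixed (not learned) smoothing kernel for the convolutional branch, and single-head dot-product self-attention with identity projections for the transformer branch. The function names `conv_encoder`, `attention_encoder` and `dual_encoder` are hypothetical and chosen only for this illustration. A real model would operate on multi-channel frame features with learned weights.

```python
import math


def conv_encoder(x, kernel):
    """Local branch: 1D convolution with zero padding, same-length output.

    Each output depends only on a small neighbourhood of the input.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):  # positions outside the signal count as zero
                acc += w * x[j]
        out.append(acc)
    return out


def attention_encoder(x):
    """Global branch: single-head dot-product self-attention.

    Assumption for this toy: queries, keys and values are the raw scalar
    tokens (identity projections), so every output mixes all positions.
    """
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out


def dual_encoder(x, kernel=(0.25, 0.5, 0.25)):
    """Run both branches and pair their features per position."""
    local = conv_encoder(x, kernel)
    global_ = attention_encoder(x)
    return list(zip(local, global_))


# Two identical impulses that are far apart in the sequence.
x = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0]
features = dual_encoder(x)
for position, (local_feat, global_feat) in enumerate(features):
    print(position, round(local_feat, 3), round(global_feat, 3))
```

In this toy, the convolutional feature at a position reflects only its immediate neighbours, while the attention feature at positions 2 and 7 is identical because each impulse attends to the other regardless of distance. Note that the attention branch computes one score per pair of tokens, which is the quadratic cost that motivates efficiency-oriented designs of the kind the abstract describes.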
Pages: 1050-1054
Page count: 5