Video Frame Interpolation Transformer

Cited by: 49
Authors
Shi, Zhihao [1]
Xu, Xiangyu [2]
Liu, Xiaohong [3]
Chen, Jun [1]
Yang, Ming-Hsuan [4, 5, 6]
Affiliations
[1] McMaster Univ, Hamilton, ON, Canada
[2] Nanyang Technol Univ, Singapore, Singapore
[3] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[4] Univ Calif Merced, Merced, CA USA
[5] Yonsei Univ, Seoul, South Korea
[6] Google Res, Mountain View, CA USA
DOI
10.1109/CVPR52688.2022.01696
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing methods for video interpolation heavily rely on deep convolutional neural networks, and thus suffer from their intrinsic limitations, such as content-agnostic kernel weights and restricted receptive fields. To address these issues, we propose a Transformer-based video interpolation framework that allows content-aware aggregation weights and considers long-range dependencies with the self-attention operations. To avoid the high computational cost of global self-attention, we introduce the concept of local attention into video interpolation and extend it to the spatial-temporal domain. Furthermore, we propose a space-time separation strategy to save memory usage, which also improves performance. In addition, we develop a multi-scale frame synthesis scheme to fully realize the potential of Transformers. Extensive experiments demonstrate that the proposed model performs favorably against the state-of-the-art methods both quantitatively and qualitatively on a variety of benchmark datasets. The code and models are released at https://github.com/zhshi0816/Video-Frame-Interpolation-Transformer.
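The local-attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their released code is at the repository linked above); it is a simplified pure-Python example on a 1-D token sequence, assuming non-overlapping windows and identity query/key/value projections. The function names `local_self_attention` and `attention_pairs` are hypothetical and chosen only for this illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_self_attention(tokens, window):
    """Self-attention restricted to non-overlapping windows of a 1-D sequence.

    tokens: list of feature vectors (lists of floats, all the same length).
    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), to keep the sketch short.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for start in range(0, len(tokens), window):
        block = tokens[start:start + window]
        for query in block:
            # Content-aware weights: they depend on the token values,
            # unlike the fixed kernel weights of a convolution.
            scores = [scale * sum(a * b for a, b in zip(query, key))
                      for key in block]
            weights = softmax(scores)
            out.append([sum(w * value[d] for w, value in zip(weights, block))
                        for d in range(dim)])
    return out


def attention_pairs(num_tokens, window=None):
    """Count query-key pairs for global (window=None) or windowed attention."""
    if window is None:
        return num_tokens * num_tokens
    full, rest = divmod(num_tokens, window)
    return full * window * window + rest * rest


tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(local_self_attention(tokens, 2))
print(attention_pairs(64), attention_pairs(64, 8))
```

In this sketch the number of query-key pairs grows quadratically with sequence length for global attention but only linearly for a fixed window size, which is the cost argument the abstract makes for local attention. The spatial-temporal extension and the space-time separation strategy described in the abstract are not modeled here.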
Pages: 17461-17470
Page count: 10
Related papers
50 in total
  • [1] Video Frame Interpolation with Transformer
    Lu, Liying
    Wu, Ruizheng
    Lin, Huaijia
    Lu, Jiangbo
    Jia, Jiaya
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 3522 - 3532
  • [2] Video Frame Interpolation with Flow Transformer
    Gao, Pan
    Tian, Haoyue
    Qin, Jie
    [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 1933 - 1942
  • [3] EFFICIENT CONVOLUTION AND TRANSFORMER-BASED NETWORK FOR VIDEO FRAME INTERPOLATION
    Khalifeh, Issa
    Murn, Luka
    Mrak, Marta
    Izquierdo, Ebroul
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1050 - 1054
  • [4] Parallel Spatio-Temporal Attention Transformer for Video Frame Interpolation
    Ning, Xin
    Cai, Feifan
    Li, Yuhang
    Ding, Youdong
    [J]. ELECTRONICS, 2024, 13 (10)
  • [5] TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation
    Liu, Chengxu
    Yang, Huan
    Fu, Jianlong
    Qian, Xueming
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 4728 - 4741
  • [6] Blurry Video Frame Interpolation
    Shen, Wang
    Bao, Wenbo
    Zhai, Guangtao
    Chen, Li
    Min, Xiongkuo
    Gao, Zhiyong
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 5113 - 5122
  • [7] PhaseNet for Video Frame Interpolation
    Meyer, Simone
    Djelouah, Abdelaziz
    McWilliams, Brian
    Sorkine-Hornung, Alexander
    Gross, Markus
    Schroers, Christopher
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 498 - 507
  • [8] Frame Interpolation Transformer and Uncertainty Guidance
    Plack, Markus
    Hullin, Matthias B.
    Briedis, Karlis Martins
    Gross, Markus
    Djelouah, Abdelaziz
    Schroers, Christopher
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 9811 - 9821
  • [9] Softmax Splatting for Video Frame Interpolation
    Niklaus, Simon
    Liu, Feng
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 5436 - 5445
  • [10] Exploring Discontinuity for Video Frame Interpolation
    Lee, Sangjin
    Lee, Hyeongmin
    Shin, Chajin
    Son, Hanbin
    Lee, Sangyoun
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 9791 - 9800