Space-time super-resolution for satellite video: A joint framework based on multi-scale spatial-temporal transformer

Cited by: 72
Authors
Xiao, Yi [1]
Yuan, Qiangqiang [1]
He, Jiang [1]
Zhang, Qiang [2]
Sun, Jing [3]
Su, Xin [4]
Wu, Jialian [1]
Zhang, Liangpei [2]
Affiliations
[1] Wuhan Univ, Sch Geodesy & Geomat, Wuhan, Peoples R China
[2] Wuhan Univ, State Key Lab Informat Engn Survey Mapping & Remo, Wuhan, Peoples R China
[3] Wuhan Univ, Sch Resource & Environm Sci, Wuhan, Peoples R China
[4] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video super-resolution; Video frame interpolation; Jilin-1 satellite video; Deep learning; CONVOLUTIONAL NEURAL-NETWORK; THICK CLOUD; REMOVAL;
DOI
10.1016/j.jag.2022.102731
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology];
Subject classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Satellite video is an emerging earth observation tool that has attracted increasing attention because of its applications in dynamic analysis. However, most studies focus only on improving the spatial resolution of satellite video imagery. Few works address temporal resolution, and joint spatial-temporal enhancement has received even less attention. Joint spatial-temporal enhancement can not only produce high-resolution imagery for subsequent applications but also reveal clear motion dynamics for observing extreme events. In this paper, we propose a joint framework to enhance the spatial and temporal resolution of satellite video simultaneously. First, to alleviate the problems of scale variation and scarce motion in satellite video, we design a feature interpolation module that deeply couples optical flow and multi-scale deformable convolution to predict unknown frames. Deformable convolution can adaptively learn multi-scale motion information and complement the optical flow information. Second, a multi-scale spatial-temporal transformer is proposed to effectively aggregate contextual information across long time-series video frames. Because multi-scale patches are embedded in multiple heads for spatial-temporal self-attention, multi-scale details in all frames can be exploited comprehensively. Extensive experiments on Jilin-1 satellite video demonstrate that our model is superior to existing methods. The source code is available at https://github.com/XY-boy.
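The idea of embedding patches of different sizes in different attention heads can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes identity query/key/value projections (no learned weights), single-channel frames stored as nested lists, and hypothetical patch sizes of 1, 2 and 4. The function names are illustrative only. Each "head" cuts every frame into non-overlapping patches of one size and runs scaled dot-product self-attention over the patches pooled from all frames.

```python
import math

def extract_patches(frames, p):
    """Cut every frame into non-overlapping p x p patches and flatten each one.

    Patches from all frames are pooled into a single token list, so attention
    later spans both space and time.
    """
    tokens = []
    for frame in frames:  # frame: H x W nested list of floats
        h, w = len(frame), len(frame[0])
        for i in range(0, h - p + 1, p):
            for j in range(0, w - p + 1, p):
                tokens.append(
                    [frame[i + di][j + dj] for di in range(p) for dj in range(p)]
                )
    return tokens

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V (an assumption)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(
            [sum(wt / z * v[c] for wt, v in zip(weights, tokens)) for c in range(d)]
        )
    return out

def multi_scale_attention(frames, patch_sizes=(1, 2, 4)):
    """One head per patch size; every head attends over patches from all frames."""
    return {p: self_attention(extract_patches(frames, p)) for p in patch_sizes}

# Three toy 4 x 4 frames with a simple intensity ramp.
frames = [[[float(t + i + j) for j in range(4)] for i in range(4)] for t in range(3)]
heads = multi_scale_attention(frames)
for p, tokens in heads.items():
    print(p, len(tokens), len(tokens[0]))
```

With three 4 x 4 frames, the head with patch size 1 sees 48 tokens of length 1, the head with patch size 2 sees 12 tokens of length 4, and the head with patch size 4 sees 3 tokens of length 16. Smaller patches give fine detail at a higher attention cost, while larger patches summarize coarser structure.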
Pages: 11