Space-time super-resolution for satellite video: A joint framework based on multi-scale spatial-temporal transformer

Cited by: 72

Authors:

Xiao, Yi [1]

Yuan, Qiangqiang [1]

He, Jiang [1]

Zhang, Qiang [2]

Sun, Jing [3]

Su, Xin [4]

Wu, Jialian [1]

Zhang, Liangpei [2]

Affiliations:

[1] Wuhan Univ, Sch Geodesy & Geomat, Wuhan, Peoples R China

[2] Wuhan Univ, State Key Lab Informat Engn Survey Mapping & Remo, Wuhan, Peoples R China

[3] Wuhan Univ, Sch Resource & Environm Sci, Wuhan, Peoples R China

[4] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Peoples R China

Source:

INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION | 2022 / Vol. 108

Funding:

National Natural Science Foundation of China

Keywords:

Video super-resolution; Video frame interpolation; Jilin-1 satellite video; Deep learning; CONVOLUTIONAL NEURAL-NETWORK; THICK CLOUD; REMOVAL;

DOI:

10.1016/j.jag.2022.102731

Chinese Library Classification (CLC) number:

TP7 [Remote sensing technology]

Subject classification codes:

081102 ; 0816 ; 081602 ; 083002 ; 1404 ;

Abstract:

Satellite video is an emerging earth observation tool that has attracted increasing attention because of its applications in dynamic analysis. However, most studies focus only on improving the spatial resolution of satellite video imagery. In contrast, few works are committed to enhancing the temporal resolution, and joint spatial-temporal enhancement has received even less attention. Joint spatial-temporal enhancement can not only produce high-resolution imagery for subsequent applications but also offer the potential to capture clear motion dynamics for observing extreme events. In this paper, we propose a joint framework to enhance the spatial and temporal resolution of satellite video simultaneously. First, to alleviate the problems of scale variation and scarce motion in satellite video, we design a feature interpolation module that deeply couples optical flow and multi-scale deformable convolution to predict unknown frames. Deformable convolution can adaptively learn multi-scale motion information and complement the optical flow information. Second, a multi-scale spatial-temporal transformer is proposed to effectively aggregate contextual information across long time series of video frames. Since multi-scale patches are embedded in multiple heads for spatial-temporal self-attention calculation, we can comprehensively exploit multi-scale details in all frames. Extensive experiments on Jilin-1 satellite video demonstrate that our model is superior to existing methods. The source code is available at https://github.com/XY-boy.
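The idea of embedding multi-scale patches in different attention heads can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`to_patches`, `from_patches`, `multi_scale_st_attention`), the equal channel split per head, and the use of identity query/key/value projections in place of learned ones are all assumptions made for illustration. Each head cuts every frame into patches of its own size and attends over the patches of all frames jointly.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def to_patches(x, p):
    # (T, H, W, C) -> (T * H/p * W/p, p*p*C): one token per p x p patch.
    T, H, W, C = x.shape
    x = x.reshape(T, H // p, p, W // p, p, C)
    x = x.transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(T * (H // p) * (W // p), p * p * C)

def from_patches(tokens, p, shape):
    # Inverse of to_patches: fold tokens back to (T, H, W, C).
    T, H, W, C = shape
    x = tokens.reshape(T, H // p, W // p, p, p, C)
    x = x.transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(T, H, W, C)

def multi_scale_st_attention(feats, patch_sizes):
    # Hypothetical sketch: one head per patch size, each head working on
    # its own slice of channels. Identity projections stand in for the
    # learned query/key/value layers of a real transformer.
    T, H, W, C = feats.shape
    heads = len(patch_sizes)
    assert C % heads == 0, "channels must split evenly across heads"
    c = C // heads
    out = np.empty_like(feats)
    for h, p in enumerate(patch_sizes):
        assert H % p == 0 and W % p == 0, "patch size must divide H and W"
        x = feats[..., h * c:(h + 1) * c]
        tok = to_patches(x, p)                      # tokens from all frames
        scores = tok @ tok.T / np.sqrt(tok.shape[1])
        attn = softmax(scores, axis=-1)             # spatial-temporal weights
        out[..., h * c:(h + 1) * c] = from_patches(attn @ tok, p, x.shape)
    return out

# Example: 3 frames of 8x8 features with 4 channels, one head per scale.
feats = np.random.default_rng(0).standard_normal((3, 8, 8, 4))
fused = multi_scale_st_attention(feats, patch_sizes=[1, 2, 4, 8])
print(fused.shape)
```

Because tokens from every frame enter the same attention matrix, a patch in one frame can draw on similar patches in any other frame, while the different patch sizes let the heads compare content at different spatial scales.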

Pages: 11
