Self-Supervised Spatio-Temporal Representation Learning of Satellite Image Time Series

Cited by: 1
Authors
Dumeur, Iris [1 ]
Valero, Silvia [1 ]
Inglada, Jordi [1 ]
Affiliations
[1] Univ Toulouse, CESBIO, F-31000 Toulouse, France
Keywords
Representation learning; satellite image time series (SITS); self-supervised learning (SSL); spatio-temporal network; transformer; Unet; NEURAL-NETWORKS;
DOI
10.1109/JSTARS.2024.3358066
CLC Number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject Classification Code
0808; 0809;
Abstract
In this article, a new self-supervised strategy for learning meaningful representations of complex optical satellite image time series (SITS) is presented. The proposed methodology, named Unet-BERT spAtio-temporal Representation eNcoder (U-BARN), exploits irregularly sampled SITS. The designed architecture learns rich and discriminative features from unlabeled data, enhancing the synergy between the spatio-spectral and temporal dimensions. To train on unlabeled data, a time-series reconstruction pretext task, inspired by the BERT strategy but adapted to SITS, is proposed. A large-scale unlabeled Sentinel-2 dataset is used to pretrain U-BARN. During pretraining, U-BARN processes annual time series composed of a maximum of 100 dates. To demonstrate its feature-learning capability, representations of SITS encoded by U-BARN are then fed into a shallow classifier to generate semantic segmentation maps. Experiments are conducted on a labeled crop dataset (PASTIS) as well as a dense land cover dataset (MultiSenGE). Two ways of exploiting the U-BARN pretraining are considered: U-BARN weights are either frozen or fine-tuned. The obtained results demonstrate that representations of SITS given by the frozen U-BARN are more efficient for land cover and crop classification than those of a supervised-trained linear layer. We then observe that fine-tuning boosts U-BARN performance on the MultiSenGE dataset. In addition, we observe on PASTIS that, in scenarios with scarce reference data, fine-tuning brings a significant performance gain compared to fully supervised approaches. We also investigate the influence of the percentage of elements masked during pretraining on the quality of the SITS representation. Finally, semantic segmentation results show that the fully supervised U-BARN architecture reaches better performance than the spatio-temporal baseline (U-TAE) on both downstream tasks: crop and dense land cover segmentation.
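The BERT-inspired pretext task described in the abstract (masking dates of an irregularly sampled series and reconstructing them) can be sketched as follows. This is a minimal NumPy illustration, assuming random date masking with a zero token and an MSE loss restricted to the masked dates; `mask_dates` and `reconstruction_loss` are hypothetical helper names, not the paper's code, and the actual U-BARN masking scheme may differ in detail.

```python
import numpy as np

def mask_dates(sits, mask_ratio=0.3, rng=None):
    """Randomly mask a fraction of acquisition dates in a SITS array.

    sits: array of shape (T, C, H, W) -- T dates, C spectral bands.
    Returns the corrupted series and a boolean mask over the T dates.
    """
    rng = rng or np.random.default_rng(0)
    n_dates = sits.shape[0]
    n_masked = max(1, int(round(mask_ratio * n_dates)))
    masked_idx = rng.choice(n_dates, size=n_masked, replace=False)
    date_mask = np.zeros(n_dates, dtype=bool)
    date_mask[masked_idx] = True
    corrupted = sits.copy()
    corrupted[date_mask] = 0.0  # replace masked dates with a zero token
    return corrupted, date_mask

def reconstruction_loss(pred, target, date_mask):
    """MSE computed only on the masked dates, as in BERT-style pretraining."""
    return float(np.mean((pred[date_mask] - target[date_mask]) ** 2))

# Toy usage: a 10-date annual series, 4 bands, on an 8x8 patch.
sits = np.random.default_rng(1).normal(size=(10, 4, 8, 8))
corrupted, m = mask_dates(sits, mask_ratio=0.3)
# A real encoder would predict the masked dates from `corrupted`;
# here a zero prediction stands in for the model output.
loss = reconstruction_loss(np.zeros_like(sits), sits, m)
```

The key design point mirrored here is that the loss is evaluated only at masked positions, so the encoder must infer missing acquisitions from the surrounding spatio-temporal context rather than copy its input.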
Pages: 4350-4367 (18 pages)
Related Papers
50 items in total
  • [1] SELF-SUPERVISED SPATIO-TEMPORAL REPRESENTATION LEARNING OF SATELLITE IMAGE TIME SERIES
    Dumeur, Iris
    Valero, Silvia
    Inglada, Jordi
    [J]. IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 642 - 645
  • [2] Self-Supervised Video Representation Learning by Uncovering Spatio-Temporal Statistics
    Wang, Jiangliu
    Jiao, Jianbo
    Bao, Linchao
    He, Shengfeng
    Liu, Wei
    Liu, Yun-hui
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (07) : 3791 - 3806
  • [3] Contrastive Spatio-Temporal Pretext Learning for Self-Supervised Video Representation
    Zhang, Yujia
    Po, Lai-Man
    Xu, Xuyuan
    Liu, Mengyang
    Wang, Yexin
    Ou, Weifeng
    Zhao, Yuzhi
    Yu, Wing-Yin
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3380 - 3389
  • [4] Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning
    Yao, Yuan
    Liu, Chang
    Luo, Dezhao
    Zhou, Yu
    Ye, Qixiang
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 6547 - 6556
  • [5] Joint spatio-temporal features constrained self-supervised electrocardiogram representation learning
    Ran, Ao
    Liu, Huafeng
    [J]. BIOMEDICAL ENGINEERING LETTERS, 2024, 14 (02) : 209 - 220
  • [7] Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics
    Wang, Jiangliu
    Jiao, Jianbo
    Bao, Linchao
    He, Shengfeng
    Liu, Yunhui
    Liu, Wei
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 4001 - 4010
  • [8] Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds
    Huang, Siyuan
    Xie, Yichen
    Zhu, Song-Chun
    Zhu, Yixin
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 6515 - 6525
  • [9] Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning
    Luo, Dezhao
    Liu, Chang
    Zhou, Yu
    Yang, Dongbao
    Ma, Can
    Ye, Qixiang
    Wang, Weiping
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11701 - 11708
  • [10] Spatio-Temporal Self-Supervised Learning for Traffic Flow Prediction
    Ji, Jiahao
    Wang, Jingyuan
    Huang, Chao
    Wu, Junjie
    Xu, Boren
    Wu, Zhenhe
    Zhang, Junbo
    Zheng, Yu
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 4, 2023, : 4356 - 4364