Towards a Multimodal Framework for Remote Sensing Image Change Retrieval and Captioning

Cited by: 0

Authors:

Ferrod, Roger [1]

Di Caro, Luigi [1]

Ienco, Dino [2,3]

Affiliations:

[1] Univ Turin, Turin, Italy

[2] Univ Montpellier, INRAE, UMR TETIS, Montpellier, France

[3] Univ Montpellier, INRIA, Montpellier, France

Source:

DISCOVERY SCIENCE, DS 2024, PT II | 2025 / Vol. 15244

Keywords:

Remote Sensing; bi-temporal change detection; image captioning; text-image retrieval; contrastive learning;

DOI:

10.1007/978-3-031-78980-9_15

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recently, there has been increasing interest in multimodal applications that integrate text with other modalities, such as images, audio and video, to facilitate natural language interactions with multimodal AI systems. While applications involving standard modalities have been extensively explored, there is still a lack of investigation into specific data modalities such as remote sensing (RS) data. Despite the numerous potential applications of RS data, including environmental protection, disaster monitoring and land planning, available solutions are predominantly focused on specific tasks like classification, captioning and retrieval. These solutions often overlook the unique characteristics of RS data, such as its capability to systematically provide information on the same geographical areas over time. This ability enables continuous monitoring of changes in the underlying landscape. To address this gap, we propose a novel foundation model for bi-temporal RS image pairs, in the context of change detection analysis, leveraging Contrastive Learning and the LEVIR-CC dataset for both captioning and text-image retrieval. By jointly training a contrastive encoder and captioning decoder, our model adds text-image retrieval capabilities, in the context of bi-temporal change detection, while maintaining captioning performance comparable to the state of the art. We release the source code and pretrained weights at: https://github.com/rogerferrod/RSICRC.

Pages: 231-245

Page count: 15
