Towards a Multimodal Framework for Remote Sensing Image Change Retrieval and Captioning

Authors
Ferrod, Roger [1]
Di Caro, Luigi [1]
Ienco, Dino [2, 3]
Affiliations
[1] Univ Turin, Turin, Italy
[2] Univ Montpellier, INRAE, UMR TETIS, Montpellier, France
[3] Univ Montpellier, INRIA, Montpellier, France
Keywords
remote sensing; bi-temporal change detection; image captioning; text-image retrieval; contrastive learning
DOI
10.1007/978-3-031-78980-9_15
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, there has been increasing interest in multimodal applications that integrate text with other modalities, such as images, audio and video, to facilitate natural language interactions with multimodal AI systems. While applications involving standard modalities have been extensively explored, there is still a lack of investigation into specific data modalities such as remote sensing (RS) data. Despite the numerous potential applications of RS data, including environmental protection, disaster monitoring and land planning, available solutions are predominantly focused on specific tasks like classification, captioning and retrieval. These solutions often overlook the unique characteristics of RS data, such as its capability to systematically provide information on the same geographical areas over time. This ability enables continuous monitoring of changes in the underlying landscape. To address this gap, we propose a novel foundation model for bi-temporal RS image pairs, in the context of change detection analysis, leveraging contrastive learning and the LEVIR-CC dataset for both captioning and text-image retrieval. By jointly training a contrastive encoder and a captioning decoder, our model adds text-image retrieval capabilities, in the context of bi-temporal change detection, while maintaining captioning performance comparable to the state of the art. We release the source code and pretrained weights at: https://github.com/rogerferrod/RSICRC.
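The joint objective named in the abstract (a contrastive encoder trained together with a captioning decoder) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation, which is in the linked repository. It assumes a symmetric InfoNCE loss over matched (image pair, caption) embeddings plus a token-level cross-entropy, and every function name, the temperature of 0.07 and the `caption_weight` parameter are hypothetical choices made only for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def contrastive_loss(pair_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE: the i-th image pair is the positive for the i-th caption."""
    p = [l2_normalize(v) for v in pair_embs]
    t = [l2_normalize(v) for v in text_embs]
    n = len(p)
    # Temperature-scaled cosine similarities, rows = image pairs, columns = captions.
    sim = [[sum(a * b for a, b in zip(p[i], t[j])) / temperature for j in range(n)]
           for i in range(n)]
    sim_t = [list(col) for col in zip(*sim)]
    pair_to_text = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    text_to_pair = -sum(log_softmax(sim_t[i])[i] for i in range(n)) / n
    return 0.5 * (pair_to_text + text_to_pair)


def captioning_loss(token_logits, target_ids):
    """Mean cross-entropy of the reference caption tokens."""
    losses = [-log_softmax(logits)[tid] for logits, tid in zip(token_logits, target_ids)]
    return sum(losses) / len(losses)


def joint_loss(pair_embs, text_embs, token_logits, target_ids, caption_weight=1.0):
    """Hypothetical weighted sum of the two objectives (the weighting is an assumption)."""
    return (contrastive_loss(pair_embs, text_embs)
            + caption_weight * captioning_loss(token_logits, target_ids))


def retrieve(query_text_emb, pair_embs):
    """Rank image-pair indices by cosine similarity to a text query, best first."""
    q = l2_normalize(query_text_emb)
    scores = [sum(a * b for a, b in zip(q, l2_normalize(v))) for v in pair_embs]
    return sorted(range(len(pair_embs)), key=lambda i: scores[i], reverse=True)
```

At inference time, the same embedding space serves retrieval: a change description is embedded once and compared against stored image-pair embeddings, as `retrieve` does, while the decoder branch is used on its own for captioning.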
Pages: 231-245
Page count: 15