Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance

Authors
Zhu, Yongshuo [1]
Li, Lu [1]
Chen, Keyan [2, 3]
Liu, Chenyang [2, 3]
Zhou, Fugen [1]
Shi, Zhenwei [2, 3]
Affiliations
[1] Beihang University, Image Processing Center, School of Astronautics, Beijing, 100191, China
[2] Beihang University, Image Processing Center, School of Astronautics, State Key Laboratory of Virtual Reality Technology and Systems, Beijing, 100191, China
[3] Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
Funding
National Natural Science Foundation of China
Keywords
Adaptive boosting; Change detection; Multi-task learning; Optical remote sensing
DOI
10.1109/TGRS.2024.3497338
Abstract
Remote sensing image change captioning (RSICC) aims to articulate the changes in objects of interest within bitemporal remote sensing images using natural language. Given the limitations of current RSICC methods in expressing general features across multitemporal and spatial scenarios, and their deficiency in providing granular, robust, and precise change descriptions, we introduce a novel change captioning (CC) method based on foundational knowledge and semantic guidance, which we term Semantic-CC. Semantic-CC alleviates the dependency of high-generalization algorithms on extensive annotations by harnessing the latent knowledge of foundation models, and it generates more comprehensive and accurate change descriptions guided by pixel-level semantics from change detection (CD). Specifically, we propose a bitemporal SAM-based encoder for dual-image feature extraction; a multitask semantic aggregation neck for facilitating information interaction between heterogeneous tasks; a straightforward multiscale CD decoder to provide pixel-level semantic guidance; and a change caption decoder based on a large language model (LLM) to generate change description sentences. Moreover, to ensure the stability of the joint training of CD and CC, we propose a three-stage training strategy that supervises different tasks at various stages. We validate the proposed method on the LEVIR-CC and LEVIR-CD datasets. The experimental results corroborate the complementarity of CD and CC, demonstrating that Semantic-CC can generate more accurate change descriptions and achieve optimal performance across both tasks. © 2024 IEEE.
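The staged supervision mentioned in the abstract can be pictured with a minimal sketch. The abstract does not say which task is supervised in which stage, so the schedule below (CD only, then CC only, then both) is purely an assumption for illustration, and the names `StageConfig`, `SCHEDULE`, and `total_loss` are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """Which task losses are supervised during one training stage."""
    name: str
    supervise_cd: bool  # change detection (pixel-level mask) loss
    supervise_cc: bool  # change captioning (sentence) loss


# ASSUMPTION: the abstract only states that different tasks are supervised
# at different stages; this particular ordering is illustrative.
SCHEDULE = (
    StageConfig("stage_1", supervise_cd=True, supervise_cc=False),
    StageConfig("stage_2", supervise_cd=False, supervise_cc=True),
    StageConfig("stage_3", supervise_cd=True, supervise_cc=True),
)


def total_loss(stage: StageConfig, cd_loss: float, cc_loss: float) -> float:
    """Sum only the task losses that the given stage supervises."""
    loss = 0.0
    if stage.supervise_cd:
        loss += cd_loss
    if stage.supervise_cc:
        loss += cc_loss
    return loss


if __name__ == "__main__":
    # Placeholder loss values, not results from the paper.
    for stage in SCHEDULE:
        print(stage.name, total_loss(stage, cd_loss=0.5, cc_loss=2.0))
```

In a real training loop the two inputs would be the CD decoder loss and the caption decoder loss computed on each batch; the sketch only shows how a stage switch could gate them.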