Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance

Authors
Zhu, Yongshuo [1]
Li, Lu [1]
Chen, Keyan [2, 3]
Liu, Chenyang [2, 3]
Zhou, Fugen [1]
Shi, Zhenwei [2, 3]
Affiliations
[1] Beihang University, Image Processing Center, School of Astronautics, Beijing, 100191, China
[2] Beihang University, Image Processing Center, School of Astronautics, State Key Laboratory of Virtual Reality Technology and Systems, Beijing, 100191, China
[3] Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
Funding
National Natural Science Foundation of China
Keywords
Adaptive boosting - Change detection - Multi-task learning - Optical remote sensing
DOI
10.1109/TGRS.2024.3497338
Abstract
Remote sensing image change captioning (RSICC) aims to articulate the changes in objects of interest within bitemporal remote sensing images using natural language. Given the limitations of current RSICC methods in expressing general features across multitemporal and spatial scenarios, and their deficiency in providing granular, robust, and precise change descriptions, we introduce a novel change captioning (CC) method based on the foundational knowledge and semantic guidance, which we term Semantic-CC. Semantic-CC alleviates the dependency of high-generalization algorithms on extensive annotations by harnessing the latent knowledge of foundation models, and it generates more comprehensive and accurate change descriptions guided by pixel-level semantics from change detection (CD). Specifically, we propose a bitemporal SAM-based encoder for dual-image feature extraction; a multitask semantic aggregation neck for facilitating information interaction between heterogeneous tasks; a straightforward multiscale CD decoder to provide pixel-level semantic guidance; and a change caption decoder based on the large language model (LLM) to generate change description sentences. Moreover, to ensure the stability of the joint training of CD and CC, we propose a three-stage training strategy that supervises different tasks at various stages. We validate the proposed method on the LEVIR-CC and LEVIR-CD datasets. The experimental results corroborate the complementarity of CD and CC, demonstrating that Semantic-CC can generate more accurate change descriptions and achieve optimal performance across both tasks. © 2024 IEEE.