TSFE: Two-Stage Feature Enhancement for Remote Sensing Image Captioning

Cited by: 3
Authors
Guo, Jie [1]
Li, Ze [1]
Song, Bin [1]
Chi, Yuhao [1]
Affiliations
[1] Xidian Univ, State Key Lab Integrated Serv Networks, Xian 710071, Peoples R China
Keywords
attention mechanism; fine-grained feature; two-stage enhancement; remote sensing image captioning; feature interaction decoder; FUSION;
DOI
10.3390/rs16111843
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
In the field of remote sensing image captioning (RSIC), mainstream methods typically adopt an encoder-decoder framework. Methods based on this framework often use only simple feature fusion strategies and therefore fail to fully mine the fine-grained features of the remote sensing image. Moreover, the decoder lacks contextual information, which makes the generated sentences less accurate. To address these problems, we propose a two-stage feature enhancement model (TSFE) for remote sensing image captioning. In the first stage, we adopt an adaptive feature fusion strategy to acquire multi-scale features. In the second stage, we further mine fine-grained features from the multi-scale features by establishing associations between different regions of the image. In addition, we introduce global features carrying scene information into the decoder to help generate descriptions. Experimental results on the RSICD, UCM-Captions, and Sydney-Captions datasets demonstrate that the proposed method outperforms existing state-of-the-art approaches.
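The two stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the fusion rule or the form of the region associations, so the softmax-weighted sum, the scaled dot-product self-attention, the mean-pooled global vector, and all function names below are assumptions chosen only to make the idea concrete.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_fuse(scales, scores):
    """Stage 1 (assumed form): weighted sum of same-shaped multi-scale features.

    scales: list of S feature sets, each a list of R region vectors of size D.
    scores: S unnormalised scalars (learned in a real model, fixed here).
    """
    weights = softmax(scores)
    n_regions, dim = len(scales[0]), len(scales[0][0])
    return [
        [sum(w * s[r][d] for w, s in zip(weights, scales)) for d in range(dim)]
        for r in range(n_regions)
    ]

def region_attention(regions):
    """Stage 2 (assumed form): scaled dot-product self-attention across regions."""
    dim = len(regions[0])
    out = []
    for q in regions:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in regions]
        attn = softmax(logits)
        out.append([sum(a * v[d] for a, v in zip(attn, regions)) for d in range(dim)])
    return out

def global_feature(regions):
    """Mean-pooled scene vector, a stand-in for the global feature fed to the decoder."""
    n = len(regions)
    return [sum(r[d] for r in regions) / n for d in range(len(regions[0]))]

# Hypothetical toy input: two scales, three regions, two channels.
coarse = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fine = [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]
fused = adaptive_fuse([coarse, fine], scores=[0.0, 0.0])
enhanced = region_attention(fused)
scene = global_feature(enhanced)
```

With equal scores the fusion reduces to a plain average of the two scales; in a trained model the scores would be learned so that the weighting adapts to the input.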
Pages: 19
Related papers
50 in total
  • [1] A Joint-Training Two-Stage Method For Remote Sensing Image Captioning
    Ye, Xiutiao
    Wang, Shuang
    Gu, Yu
    Wang, Jihui
    Wang, Ruixuan
    Hou, Biao
    Giunchiglia, Fausto
    Jiao, Licheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [2] Two-Stage Reranking for Remote Sensing Image Retrieval
    Tang, Xu
    Jiao, Licheng
    Emery, William J.
    Liu, Fang
    Zhang, Dan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2017, 55 (10) : 5798 - 5817
  • [3] Remote sensing image destriping with two-stage image decomposition network
    Shi, Yu
    Wu, Feiyan
    Guo, Jian
    Li, Xi
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2025, 46 (05) : 2136 - 2158
  • [4] Feature refinement and rethinking attention for remote sensing image captioning
    Li, Yunpeng
    Tao, Chengjin
    Liu, Meng
    Zhang, Xiangrong
    Wang, Guanchun
    Zhang, Tianyang
    Zhao, Dong
    Wang, Dabao
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [5] A two-stage image enhancement and dynamic feature aggregation framework for gastroscopy image segmentation
    He, Dongzhi
    Li, Yunyu
    Chen, Liule
    Liang, Yu
    Xue, Yongle
    Xiao, Xingmei
    Li, Yunqi
    NEUROCOMPUTING, 2024, 601
  • [6] Two-Stage Object Detection Based on Deep Pruning for Remote Sensing Image
    Wang, Shengsheng
    Wang, Meng
    Zhao, Xin
    Liu, Dong
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2018), PT I, 2018, 11061 : 137 - 147
  • [7] Multi-label semantic feature fusion for remote sensing image captioning
    Wang, Shuang
    Ye, Xiutiao
    Gu, Yu
    Wang, Jihui
    Meng, Yun
    Tian, Jingxian
    Hou, Biao
    Jiao, Licheng
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2022, 184 : 1 - 18
  • [8] A two-stage fusion remote sensing image dehazing network based on multi-scale feature and hybrid attention
    Miao, Mengjun
    Huang, Heming
    Da, Feipeng
    Song, Dongke
    Fan, Yonghong
    Zhang, Miao
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (SUPPL 1) : 373 - 383
  • [9] Denoising-Based Multiscale Feature Fusion for Remote Sensing Image Captioning
    Huang, Wei
    Wang, Qi
    Li, Xuelong
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (03) : 436 - 440
  • [10] TESR: Two-Stage Approach for Enhancement and Super-Resolution of Remote Sensing Images
    Ali, Anas M.
    Benjdira, Bilel
    Koubaa, Anis
    Boulila, Wadii
    El-Shafai, Walid
    REMOTE SENSING, 2023, 15 (09)