A Novel Actor Dual-Critic Model for Remote Sensing Image Captioning

Cited by: 4
Authors
Chavhan, Ruchika [1]
Banerjee, Biplab [1]
Zhu, Xiao Xiang [2]
Chaudhuri, Subhasis [1]
Affiliations
[1] Indian Inst Technol, Mumbai, Maharashtra, India
[2] Tech Univ Munich, Signal Proc Earth Observat, Munich, Germany
DOI
10.1109/ICPR48806.2021.9412486
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We address the problem of generating textual captions from optical remote sensing (RS) images using deep reinforcement learning. Because the reference sentences describing remote sensing data show high inter-class similarity, jointly encoding the sentences and images encourages the prediction of captions that are, in many cases, semantically more precise than the ground truth. To this end, we introduce an Actor Dual-Critic training strategy in which a second critic model, in the form of an encoder-decoder RNN, encodes the latent information corresponding to the original and generated captions. While all actor-critic methods use an actor to predict sentences for an image and a critic to provide rewards, our proposed encoder-decoder RNN guarantees high-level comprehension of images through sentence-to-image translation. We observe that the proposed model generates sentences on the test data that are highly similar to the ground truth, and that it produces even better captions in many critical cases. Extensive experiments on the benchmark Remote Sensing Image Captioning Dataset (RSICD) and the UCM-captions dataset confirm the superiority of the proposed approach over the previous state of the art, with sharp gains in both the ROUGE-L and CIDEr measures.
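For readers unfamiliar with the ROUGE-L measure named in the abstract, the following is a minimal sketch of its standard sentence-level definition: an F-measure over the longest common subsequence (LCS) of candidate and reference tokens. It is not the authors' evaluation code. The function names, the whitespace tokenization, the default `beta=1.2`, and the example captions are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.2):
    """Sentence-level ROUGE-L F-score for one candidate and one reference.

    Tokenization is a plain lowercase whitespace split (an assumption);
    beta weights recall relative to precision (also an assumption here).
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Hypothetical captions, invented for this example only.
score = rouge_l(
    "many planes are parked at the airport",
    "several planes are parked near the airport",
)
```

In a reinforcement-learning captioner, a sentence-level score of this kind is one common choice of reward signal; whether and how the paper uses it in training is not stated in the abstract.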
Pages: 4918 - 4925
Number of pages: 8