Multi-Source Interactive Stair Attention for Remote Sensing Image Captioning

Cited by: 17
Authors
Zhang, Xiangrong [1]
Li, Yunpeng [1]
Wang, Xin [1]
Liu, Feixiang [1]
Wu, Zhaoji [1]
Cheng, Xina [1]
Jiao, Licheng [1]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Key Lab Intelligent Percept & Image Understanding, Minist Educ, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
remote sensing image captioning; cross-modal interaction; attention mechanism; semantic information; encoder-decoder; TRANSFORMER; NETWORK;
DOI
10.3390/rs15030579
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The aim of remote sensing image captioning (RSIC) is to describe a given remote sensing image (RSI) using coherent sentences. Most existing attention-based methods model the coherence through an LSTM-based decoder, which dynamically infers a word vector from preceding sentences. However, these methods are indirectly guided through the confusion of attentive regions, as (1) the weighted average in the attention mechanism distracts the word vector from capturing pertinent visual regions and (2) there are few constraints or rewards for learning long-range transitions. In this paper, we propose a multi-source interactive stair attention mechanism that separately models the semantics of preceding sentences and visual regions of interest. Specifically, the multi-source interaction takes previous semantic vectors as queries and applies an attention mechanism on regional features to acquire the next word vector, which reduces immediate hesitation by considering linguistics. The stair attention divides the attentive weights into three levels (the core region, the surrounding region, and other regions), so that the regions in the search scope are attended to differently. Then, a CIDEr-based reward for reinforcement learning is devised to enhance the quality of the generated sentences. Comprehensive experiments on widely used benchmarks (i.e., the Sydney-Captions, UCM-Captions, and RSICD data sets) demonstrate the superiority of the proposed model over state-of-the-art models in terms of coherence, while maintaining high accuracy.
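The abstract describes the stair attention only in words, so the following is a minimal sketch under stated assumptions, not the authors' formulation. It assumes the regional features lie on a row-major grid, that the core region is the one with the highest attention weight, that the surrounding region is its immediate grid neighbourhood, and that the three levels are expressed by fixed scale factors. The names `stair_attention`, `grid_w`, and `level_scale`, and the scale values themselves, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def stair_attention(query, regions, grid_w, level_scale=(1.0, 0.5, 0.1)):
    """Three-level ("stair") reweighting of attention over grid regions.

    query       -- previous semantic vector (the language-side query)
    regions     -- list of regional feature vectors in row-major grid order
    grid_w      -- number of grid columns (assumed layout)
    level_scale -- assumed scale factors for core / surrounding / other
    """
    # Plain dot-product attention, with the semantic vector as the query.
    weights = softmax([dot(query, r) for r in regions])

    # Assumption: the core region is the single most-attended region.
    core = max(range(len(weights)), key=weights.__getitem__)
    core_row, core_col = divmod(core, grid_w)

    # Level 0 = core, 1 = adjacent grid cells (surrounding), 2 = all others.
    levels = []
    for i in range(len(regions)):
        row, col = divmod(i, grid_w)
        dist = max(abs(row - core_row), abs(col - core_col))
        levels.append(min(dist, 2))

    # Rescale each weight by its level, then renormalise to sum to 1.
    scaled = [w * level_scale[lv] for w, lv in zip(weights, levels)]
    total = sum(scaled)
    weights = [s / total for s in scaled]

    # The context vector is the reweighted average of the regional features.
    dim = len(regions[0])
    context = [sum(w * r[k] for w, r in zip(weights, regions)) for k in range(dim)]
    return weights, levels, context


if __name__ == "__main__":
    feats = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    w, lv, ctx = stair_attention([1.0, 0.0], feats, grid_w=5)
    print(lv, [round(x, 3) for x in w])
```

In this sketch the step-wise scale factors sharpen the distribution around the core region instead of relying on the plain weighted average, which is the concern the abstract raises. How the paper actually defines the levels and their weights is not stated in this record.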
Pages: 22
Related papers
50 entries in total
  • [41] A Target Recognition Algorithm of Multi-Source Remote Sensing Image Based on Visual Internet of Things
    Sun, Xue-jun
    Lin, Jerry Chun-Wei
    MOBILE NETWORKS & APPLICATIONS, 2022, 27 (02): 784-793
  • [42] Research on Multi-source Remote Sensing Image Registration Base on SIFT Algorithm of Window Segmentation
    Jiang Yun
    Wang Jun
    2010 6TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS NETWORKING AND MOBILE COMPUTING (WICOM), 2010
  • [43] Multi-source Remote Sensing Image Registration Based on Contourlet Transform and Multiple Feature Fusion
    Huan Liu
    Gen-Fu Xiao
    Yun-Lan Tan
    Chun-Juan Ouyang
    International Journal of Automation and Computing, 2019, 16: 575-588
  • [44] A robust multi-source remote-sensing image registration method based on feature matching
    Ling, Zhi-Gang
    Liang, Yan
    Cheng, Yong-Mei
    Pan, Quan
    Shen, He
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2010, 38 (12): 2892-2897
  • [45] Evaluation of Multi-Source High-Resolution Remote Sensing Image Fusion in Aquaculture Areas
    Zhou, Weifeng
    Wang, Fei
    Wang, Xi
    Tang, Fenghua
    Li, Jiasheng
    APPLIED SCIENCES-BASEL, 2022, 12 (03)
  • [46] Semi-supervised label propagation for multi-source remote sensing image change detection
    Hao, Fan
    Ma, Zong-Fang
    Tian, Hong-Peng
    Wang, Hao
    Wu, Di
    COMPUTERS & GEOSCIENCES, 2023, 170
  • [47] Multi-source remote-sensing image matching based on epipolar line and least squares
    Chen, Peng
    Mao, Zhihua
    Chen, Jianyu
    Zhang, Xiaoping
    Li, Zifeng
    IMAGE AND SIGNAL PROCESSING FOR REMOTE SENSING XIX, 2013, 8892
  • [48] The research on the multi-source data assisting remote sensing image Management and issuing based on the network
    Wang, Xiaohua
    Wang, Shudong
    Cao, Xiuli
    Li, Qiu
    2006 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS 1-8, 2006, : 1003 - +
  • [49] Multi-source Remote Sensing Image Registration Based on Contourlet Transform and Multiple Feature Fusion
    Liu, Huan
    Xiao, Gen-Fu
    Tan, Yun-Lan
    Ouyang, Chun-Juan
    INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2019, 16 (05): 575-588
  • [50] Modeling Multi-source Remote Sensing Image Classifier Based on the MDL Principle: Experimental Studies
    Xia, Huaiying
    Hu, Rukun
    Xu, Bingxin
    Guo, Ping
    IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2010), 2010