Multiscale Methods for Optical Remote-Sensing Image Captioning

Cited by: 20
Authors
Ma, Xiaofeng [1, 2, 3]
Zhao, Rui [1, 2, 3]
Shi, Zhenwei [1, 2, 3]
Affiliations
[1] Beihang Univ, Sch Astronaut, Image Proc Ctr, Beijing 100191, Peoples R China
[2] Beihang Univ, Beijing Key Lab Digital Media, Beijing 100191, Peoples R China
[3] Beihang Univ, Sch Astronaut, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Feature extraction; Remote sensing; Task analysis; Optical imaging; Semantics; Training; Measurement; Remote-sensing image captioning; multiscale; auxiliary task; attention;
DOI
10.1109/LGRS.2020.3009243
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification codes
0708; 070902
Abstract
The optical remote-sensing image-captioning task has recently become a research hotspot because of its application prospects in military and civil fields. Many methods and data sets have been proposed. Among them, models that follow the encoder-decoder framework perform better in several respects, such as generating more accurate and flexible sentences. However, almost all of these methods use a single fixed receptive field and pay too little attention to multiscale information, which leads to incomplete image representations. In this letter, we address the multiscale problem and propose two multiscale methods, the multiscale attention (MSA) method and the multifeat attention (MFA) method, to obtain better representations for the captioning task in the remote-sensing field. The MSA method extracts features from different layers and applies the multihead attention mechanism to each of them to obtain the context features. The MFA method combines target-level and scene-level features, using the target-detection task as an auxiliary task to enrich the context feature. The experimental results show that both methods outperform the benchmark method on metrics such as BLEU, METEOR, ROUGE_L, and CIDEr.
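The MSA idea described in the abstract (features from several layers, each passed through multihead attention to give a context feature) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each layer's feature map has already been flattened into a list of region vectors with a common channel size, that heads are formed by slicing channels with no learned projections, and that the per-layer context vectors are concatenated; the abstract specifies none of these details. The function names `attend`, `multihead_context`, and `multiscale_context` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Weighted sum of the value vectors, one channel at a time.
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def multihead_context(query, regions, num_heads):
    """Slice channels into heads, attend per head, concatenate the results.

    Assumption: no learned projection matrices, so each head simply sees
    its own slice of the channels.
    """
    dim = len(query)
    if dim % num_heads != 0:
        raise ValueError("channel size must be divisible by num_heads")
    head_dim = dim // num_heads
    context = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        head_regions = [r[lo:hi] for r in regions]
        context.extend(attend(query[lo:hi], head_regions, head_regions))
    return context


def multiscale_context(query, feature_maps, num_heads=2):
    """Compute one context vector per layer (scale) and concatenate them.

    Assumption: concatenation is used to merge the scales.
    """
    context = []
    for regions in feature_maps:
        context.extend(multihead_context(query, regions, num_heads))
    return context


# Toy input: a shallow layer with three regions and a deep layer with one.
shallow = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
deep = [[1.0, 1.0, 0.0, 0.0]]
decoder_state = [1.0, 0.0, 0.0, 1.0]  # hypothetical decoder query vector
ctx = multiscale_context(decoder_state, [shallow, deep], num_heads=2)
```

In this toy run `ctx` has eight entries, four per layer. Because the deep layer has a single region, its attention weight is 1 and its half of the context equals that region's vector.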
Pages: 2001-2005
Page count: 5