Toward Remote Sensing Image Retrieval Under a Deep Image Captioning Perspective

Cited by: 48
Authors
Hoxha, Genc [1]
Melgani, Farid [1]
Demir, Begum [2]
Affiliations
[1] Univ Trento, Dept Informat Engn & Comp Sci, I-38123 Trento, Italy
[2] Tech Univ Berlin, Fac Elect Engn & Comp Sci, D-10623 Berlin, Germany
Funding
European Research Council
Keywords
Visualization; Image retrieval; Feature extraction; Semantics; Integrated circuits; Recurrent neural networks; Remote sensing; Convolutional neural network; deep learning; image captioning; image retrieval; recurrent neural network; remote sensing; semantic gap; GRAPH; MODELS;
DOI
10.1109/JSTARS.2020.3013818
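Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract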
The performance of remote sensing image retrieval (RSIR) systems depends on the capability of the extracted features to characterize the semantic content of images. Existing RSIR systems describe images by visual descriptors that model the primitives (such as different land-cover classes) present in the images. However, the visual descriptors may not be sufficient to describe the high-level complex content of RS images (e.g., attributes and relationships among different land-cover classes). To address this issue, in this article, we present an RSIR system that aims at generating and exploiting textual descriptions (captions, i.e., sentences) to accurately describe the objects present in RS images, their attributes, and the relationships among them. To this end, the proposed retrieval system consists of three main steps. The first step aims to encode the image visual features and then translate the encoded features into a textual description that summarizes the content of the image with captions. This is achieved by combining a convolutional neural network with a recurrent neural network. The second step aims to convert the generated textual descriptions into semantically meaningful feature vectors. This is achieved by using recent word embedding techniques. Finally, the last step estimates the similarity between the vectors of the textual descriptions of the query image and those of the archive images, and then retrieves the images most similar to the query image. Experimental results obtained on two different datasets show that describing the image content with captions in the framework of RSIR leads to accurate retrieval performance.
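The second and third steps described in the abstract (caption to vector, then similarity ranking) can be illustrated with a minimal sketch. It is not the authors' implementation: the captions are assumed to be already produced by step one, the three-dimensional word vectors in `TOY_VECTORS` are invented for illustration, averaging word vectors is only one simple way to embed a sentence, and cosine similarity is an assumed choice of similarity measure. All function names are hypothetical.

```python
import math

# Hypothetical toy word vectors; a real system would load pretrained embeddings.
TOY_VECTORS = {
    "river": [1.0, 0.0, 0.0],
    "bridge": [0.8, 0.2, 0.0],
    "forest": [0.0, 1.0, 0.0],
    "trees": [0.1, 0.9, 0.0],
    "airport": [0.0, 0.0, 1.0],
    "planes": [0.0, 0.1, 0.9],
}


def embed_caption(caption, vectors=TOY_VECTORS):
    """Step 2 (assumed variant): average the vectors of the known words."""
    dim = len(next(iter(vectors.values())))
    hits = [vectors[w] for w in caption.lower().split() if w in vectors]
    if not hits:
        return [0.0] * dim
    return [sum(v[d] for v in hits) / len(hits) for d in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def retrieve(query_caption, archive_captions, top_k=2):
    """Step 3: rank archive captions by similarity to the query caption."""
    q = embed_caption(query_caption)
    scored = [
        (cosine(q, embed_caption(c)), i) for i, c in enumerate(archive_captions)
    ]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [i for _, i in scored[:top_k]]


# Captions assumed to come from the captioning model of step 1.
archive = [
    "many planes at an airport",
    "a bridge over a river",
    "dense forest with trees",
]
ranked = retrieve("a river with a bridge", archive, top_k=2)
print(ranked)
```

Here `ranked` holds the indices of the archive captions in decreasing order of similarity to the query caption, so the archive image whose caption mentions both the river and the bridge is returned first.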
Pages: 4462-4475
Number of pages: 14