Toward Remote Sensing Image Retrieval Under a Deep Image Captioning Perspective

Cited by: 48

Authors:

Hoxha, Genc [1]

Melgani, Farid [1]

Demir, Begum [2]

Affiliations:

[1] Univ Trento, Dept Informat Engn & Comp Sci, I-38123 Trento, Italy

[2] Tech Univ Berlin, Fac Elect Engn & Comp Sci, D-10623 Berlin, Germany

Source:

IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING | 2020 / Vol. 13

Funding:

European Research Council;

Keywords:

Visualization; Image retrieval; Feature extraction; Semantics; Integrated circuits; Recurrent neural networks; Remote sensing; Convolutional neural network; deep learning; image captioning; image retrieval; recurrent neural network; remote sensing; semantic gap; GRAPH; MODELS;

DOI:

10.1109/JSTARS.2020.3013818

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

The performance of remote sensing image retrieval (RSIR) systems depends on the ability of the extracted features to characterize the semantic content of images. Existing RSIR systems describe images by visual descriptors that model the primitives (such as different land-cover classes) present in the images. However, visual descriptors may not be sufficient to describe the high-level complex content of RS images (e.g., attributes and relationships among different land-cover classes). To address this issue, in this article, we present an RSIR system that generates and exploits textual descriptions (captions, i.e., sentences) to accurately describe the objects present in RS images, their attributes, and the relationships between them. To this end, the proposed retrieval system consists of three main steps. The first step encodes the visual features of the image and then translates the encoded features into a textual description that summarizes the content of the image with captions. This is achieved by combining a convolutional neural network with a recurrent neural network. The second step converts the generated textual descriptions into semantically meaningful feature vectors. This is achieved by using recent word embedding techniques. Finally, the last step estimates the similarity between the vectors of the textual descriptions of the query image and those of the archive images, and then retrieves the images most similar to the query image. Experimental results obtained on two different datasets show that describing the image content with captions in the framework of RSIR leads to accurate retrieval performance.
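The second and third steps described in the abstract (turning captions into vectors, then ranking archive images by similarity to the query) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the toy three-dimensional embedding table, the averaging of word vectors into a caption vector, the use of cosine similarity, and the names `TOY_EMBEDDINGS`, `embed_caption`, and `retrieve` are all hypothetical. The captions are written by hand; in the described system they would come from the CNN-RNN captioning step.

```python
import math

# Hypothetical 3-dimensional word embeddings. A real system would load
# pretrained vectors; the abstract does not say which ones are used.
TOY_EMBEDDINGS = {
    "river": [0.9, 0.1, 0.0],
    "bridge": [0.7, 0.3, 0.1],
    "forest": [0.1, 0.9, 0.0],
    "trees": [0.2, 0.8, 0.1],
    "airport": [0.0, 0.1, 0.9],
    "runway": [0.1, 0.0, 0.8],
}


def embed_caption(caption, embeddings=TOY_EMBEDDINGS):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [embeddings[w] for w in caption.lower().split() if w in embeddings]
    if not vectors:
        dim = len(next(iter(embeddings.values())))
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_caption, archive_captions, top_k=2):
    """Rank archive images by caption similarity to the query caption."""
    query_vector = embed_caption(query_caption)
    scored = [
        (cosine_similarity(query_vector, embed_caption(caption)), image_id)
        for image_id, caption in archive_captions.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:top_k]]


# Hand-written captions standing in for the output of the captioning step.
archive = {
    "img_001": "a bridge over a river",
    "img_002": "a forest with dense trees",
    "img_003": "an airport with a long runway",
}
ranking = retrieve("a river near a bridge", archive, top_k=3)
print(ranking)
```

Averaging word vectors is only one simple way to obtain a caption-level vector; it ignores word order, so a system that needs to distinguish "river near a bridge" from "bridge near a river" would need a different sentence representation.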

Pages: 4462-4475

Page count: 14
