A novel approach for image retrieval in remote sensing using vision-language-based image caption generation

Cited by: 0
Authors
Prem Shanker Yadav [1]
Dinesh Kumar Tyagi [1]
Santosh Kumar Vipparthi [2]
Affiliations
[1] Malaviya National Institute of Technology, Department of Computer Science and Engineering
[2] Indian Institute of Technology, School of Artificial Intelligence and Data Engineering
Keywords
Image caption generation; Image retrieval; Remote sensing big data; Vision language pre-training model; TF-IDF
DOI
10.1007/s11042-024-20447-w
Abstract
Recent advances in satellite technology have led to a growing volume of Remote Sensing (RS) images, which makes the design of a precise model for retrieving the images most pertinent to a query a primary research problem. Current Remote Sensing Image Retrieval (RSIR) systems use visual descriptors to characterize the primitives (such as various land-cover types) visible in the images. However, visual descriptors are inadequate for describing the complex content of RS images. To address this problem, a new image retrieval model based on image captions is devised. The goal is to generate textual descriptions, in the form of captions, that precisely define the relations among objects. Image captioning is performed with a vision-language pre-training model. Features such as term frequency-inverse document frequency (TF-IDF), text length, and Bag of Words are extracted from the image captions, and the same features are extracted from the query text. The similarity between the query text features and the image caption features is computed with a hybrid similarity measure whose weights are tuned by the proposed Honey Badger Political Optimizer (HBPO), and this similarity is used to retrieve the image. The proposed HBPO achieved a precision of 93.3%, a recall of 93.7%, an F1-score of 93.5%, and a Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score of 0.441.
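The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the hybrid similarity measure, so the weighted sum of cosine similarities below is an assumption, the weights are fixed by hand rather than tuned with HBPO, and the captions and image identifiers are hypothetical stand-ins for the output of a captioning model.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (t.strip(".,;:!?") for t in text.lower().split())
    return [t for t in tokens if t]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def idf_table(docs_tokens):
    """Smoothed inverse document frequency over the caption corpus."""
    n = len(docs_tokens)
    df = Counter(term for toks in docs_tokens for term in set(toks))
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}


def tfidf(tokens, idf):
    """TF-IDF vector; terms that never occur in the corpus are dropped."""
    tf = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in tf.items() if t in idf}


def length_similarity(a_tokens, b_tokens):
    """Ratio of the shorter to the longer token count, in [0, 1]."""
    la, lb = len(a_tokens), len(b_tokens)
    return min(la, lb) / max(la, lb) if la and lb else 0.0


def rank(query, captions, weights=(0.6, 0.3, 0.1)):
    """Rank image ids by a weighted sum of three similarities.

    weights = (TF-IDF, Bag of Words, text length). These values are
    hand-picked assumptions; the paper tunes its weights with HBPO.
    """
    w_tfidf, w_bow, w_len = weights
    cap_tokens = {img: tokenize(c) for img, c in captions.items()}
    idf = idf_table(list(cap_tokens.values()))
    q_tokens = tokenize(query)
    q_tfidf, q_bow = tfidf(q_tokens, idf), Counter(q_tokens)

    scored = []
    for img, toks in cap_tokens.items():
        score = (
            w_tfidf * cosine(q_tfidf, tfidf(toks, idf))
            + w_bow * cosine(q_bow, Counter(toks))
            + w_len * length_similarity(q_tokens, toks)
        )
        scored.append((img, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical captions, standing in for the output of a captioning model.
captions = {
    "img_airport": "several planes are parked near the terminal at an airport",
    "img_river": "a river runs through a dense green forest",
    "img_stadium": "a large stadium is surrounded by parking lots and roads",
}
ranking = rank("planes parked at an airport", captions)
print(ranking[0][0])
```

With these captions the airport image ranks first, because it is the only caption that shares terms with the query. Changing the weights shifts how much each feature contributes to the final score, which is the role the abstract assigns to the optimizer.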
Pages: 2985-3014
Page count: 29
Related papers
50 in total
  • [21] A method of remote sensing image retrieval based on ROI
    Niu, L
    Ni, L
    Lu, W
    Yuan, M
    Third International Conference on Information Technology and Applications, Vol 2, Proceedings, 2005, : 226 - 229
  • [22] Hash-Based Remote Sensing Image Retrieval
    Han, Lirong
    Paoletti, Mercedes E.
    Tao, Xuanwen
    Wu, Zhaoyue
    Haut, Juan M.
    Li, Peng
    Pastor-Vargas, R.
    Plaza, Antonio
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 1
  • [23] Remote Sensing Image Retrieval Based on Attribute Profiles
    Song, Qian
    Huang, Rui
    Wang, Kouzhun
    2015 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND MECHANICAL AUTOMATION (CSMA), 2015, : 231 - 234
  • [24] A Novel Approach for Intellectual Image Retrieval Based on Image Content Using ANN
    Khodaskar, Anuja
    Ladhake, Sidharth
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, VOL 3, 2015, 33
  • [25] A Discriminative Feature Learning Approach for Remote Sensing Image Retrieval
    Xiong, Wei
    Lv, Yafei
    Cui, Yaqi
    Zhang, Xiaohan
    Gu, Xiangqi
    REMOTE SENSING, 2019, 11 (03)
  • [26] Caption and query translation for cross-language image retrieval
    Clough, P
    MULTILINGUAL INFORMATION ACCESS FOR TEXT, SPEECH AND IMAGES, 2005, 3491 : 614 - 625
  • [27] An Approach to Generate a Caption for an Image Collection Using Scene Graph Generation
    Phueaksri, Itthisak
    Kastner, Marc A.
    Kawanishi, Yasutomo
    Komamizu, Takahiro
    Ide, Ichiro
    IEEE ACCESS, 2023, 11 : 128245 - 128260
  • [28] A novel remote sensing image retrieval method based on visual salient point features
    Wang, Xing
    Shao, Zhenfeng
    Zhou, Xiran
    Liu, Jun
    SENSOR REVIEW, 2014, 34 (04) : 349 - 359
  • [29] A NOVEL SEMANTIC ATTRIBUTE-BASED FEATURE FOR IMAGE CAPTION GENERATION
    Wang, Wei
    Ding, Yuxuan
    Tian, Chunna
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 3081 - 3085
  • [30] Image Caption Generation Using A Deep Architecture
    Hani, Ansar
    Tagougui, Najiba
    Kherallah, Monji
    2019 INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2019, : 246 - 251