A novel approach for image retrieval in remote sensing using vision-language-based image caption generation

Authors
Prem Shanker Yadav [1]
Dinesh Kumar Tyagi [1]
Santosh Kumar Vipparthi [2]
Affiliations
[1] Malaviya National Institute of Technology, Department of Computer Science and Engineering
[2] Indian Institute of Technology, School of Artificial Intelligence and Data Engineering
Keywords
Image caption generation; Image retrieval; Remote sensing big data; Vision language pre-training model; TF-IDF
DOI
10.1007/s11042-024-20447-w
Abstract
Recent advances in satellite technology have produced a rapidly growing volume of Remote Sensing (RS) images, making the design of a precise model that retrieves the images most pertinent to a query a primary research problem. Present Remote Sensing Image Retrieval (RSIR) systems use visual descriptors to characterize the primitives (such as various land-cover types) that are visible in the images. However, visual descriptors are inadequate for describing the complicated content of RS images. To address this problem, a new image retrieval model based on image captions is devised. The goal is to generate textual descriptions, in the form of captions, that precisely define the relations among objects. Image captioning is performed with a vision-language pre-training model. Features such as term frequency-inverse document frequency (TF-IDF), text length, and Bag of Words are extracted from the image captions, and the same features are extracted from the query text. The similarity between the query text features and the image caption features is computed with a hybrid similarity measure whose weights are tuned by the proposed Honey Badger Political Optimizer (HBPO), and the result is used to retrieve the image. The proposed HBPO-based approach achieved a precision of 93.3%, recall of 93.7%, F1-score of 93.5%, and a Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score of 0.441.
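
The retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the hybrid similarity measure, so the version below is an assumption: a weighted sum of TF-IDF cosine similarity, Bag-of-Words cosine similarity, and a text-length ratio. The function names, the whitespace tokenizer, the smoothed IDF formula, and the fixed weights are all hypothetical; in the paper the weights are tuned by HBPO, which is not implemented here, and the captions would come from a vision-language pre-training model rather than being typed in by hand.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the paper's preprocessing is not given)."""
    return text.lower().split()


def idf_table(docs_tokens):
    """Smoothed inverse document frequency computed over the caption corpus."""
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector; tokens unseen in the caption corpus are dropped."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def length_similarity(a_tokens, b_tokens):
    """Ratio of the shorter text length to the longer one, in [0, 1]."""
    longest = max(len(a_tokens), len(b_tokens))
    if longest == 0:
        return 1.0
    return min(len(a_tokens), len(b_tokens)) / longest


def hybrid_similarity(q_tokens, c_tokens, idf, weights):
    """Weighted sum of the three feature similarities (assumed form)."""
    w_tfidf, w_bow, w_len = weights
    s_tfidf = cosine(tfidf(q_tokens, idf), tfidf(c_tokens, idf))
    s_bow = cosine(Counter(q_tokens), Counter(c_tokens))
    s_len = length_similarity(q_tokens, c_tokens)
    return w_tfidf * s_tfidf + w_bow * s_bow + w_len * s_len


def retrieve(query, captions, weights=(0.6, 0.3, 0.1)):
    """Rank image ids by hybrid similarity between the query and each caption.

    `weights` are placeholder values; the paper tunes them with HBPO.
    """
    tokens_by_id = {image_id: tokenize(c) for image_id, c in captions.items()}
    idf = idf_table(list(tokens_by_id.values()))
    q_tokens = tokenize(query)
    scored = [
        (image_id, hybrid_similarity(q_tokens, c_tokens, idf, weights))
        for image_id, c_tokens in tokens_by_id.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical captions standing in for model-generated ones.
    captions = {
        "img_a": "many planes are parked at an airport",
        "img_b": "a river runs through a dense forest",
        "img_c": "several buildings near a road",
    }
    for image_id, score in retrieve("planes at an airport", captions):
        print(image_id, round(score, 3))
```

Because the three weights sum to 1 and each component lies in [0, 1], the combined score also lies in [0, 1], which makes scores comparable across queries. An optimizer such as HBPO would search over the `weights` tuple to maximize a retrieval metric on labeled query-image pairs.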
Pages: 2985-3014
Page count: 29