Entity Semantic Feature Fusion Network for Remote Sensing Image-Text Retrieval

Authors
Shui, Jianan [1]
Ding, Shuaipeng [1]
Li, Mingyong [1]
Ma, Yan [1]
Affiliations
[1] Chongqing Normal Univ, Sch Comp & Informat Sci, Chongqing 401331, Peoples R China
Keywords
Remote Sensing; Image-Text Retrieval; Entity Semantic;
DOI
10.1007/978-981-97-7244-5_9
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Remote sensing image-text retrieval (RSITR) has progressed remarkably in recent years. However, previous RSITR methods typically extract image and text features from global and local perspectives, while the entity semantics unique to remote sensing images and texts are rarely considered or are ignored altogether. In this paper, we propose an Entity Semantic feature Fusion Network (ESFN), which uses the entity semantics in remote sensing images and texts to improve cross-modal alignment and retrieval accuracy. In the visual branch, we propose a Scene Entity Filtering (SEF) module, which extracts salient entity semantic features from low-level feature maps. The Multi-level Adaptive Fusion (MAF) module adaptively selects information from image features at different levels for feature fusion. In the textual branch, we embed the entity semantics of the text into the textual feature extractor so that it is aware of the entities in remote sensing text. We also design a Text Phrase Enhancement (TPE) module to further extract and enhance the entity semantics and the visually aligned information in the text. On the RSICD and RSITMD datasets, ESFN reaches R@1 and mean recall (mR) of 8.14, 22.16, 18.81, and 37.70, respectively, which supports the model's ability to perceive entity semantics in remote sensing images and texts. Performance comparisons, an ablation study, and visualization analysis further verify the effectiveness and superiority of the method.
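The abstract reports R@1 and mean recall (mR) without defining them. As a minimal sketch, assuming the convention common in image-text retrieval work (R@K is the percentage of queries whose correct match appears among the top K results, and mR averages R@K over both retrieval directions and several values of K), the metrics could be computed as below. The function names, the toy similarity matrix, and the default `ks=(1, 5, 10)` are illustrative assumptions, not details taken from the paper.

```python
def recall_at_k(sim, gt, k):
    """Percentage of queries with at least one correct item in the top k.

    sim[i][j] is the similarity between query i and candidate j.
    gt[i] is the set of candidate indices that are correct for query i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank candidates by descending similarity and keep the first k.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        if any(j in gt[i] for j in top_k):
            hits += 1
    return 100.0 * hits / len(sim)


def mean_recall(sim_i2t, gt_i2t, sim_t2i, gt_t2i, ks=(1, 5, 10)):
    """Average R@K over both retrieval directions and every K in ks.

    Assumption: mR is the plain mean of the individual R@K values.
    """
    scores = [recall_at_k(sim_i2t, gt_i2t, k) for k in ks]
    scores += [recall_at_k(sim_t2i, gt_t2i, k) for k in ks]
    return sum(scores) / len(scores)


# Toy example (hypothetical data): 3 images, 3 captions, one caption per image.
sim_i2t = [
    [0.9, 0.1, 0.2],
    [0.3, 0.2, 0.8],
    [0.1, 0.7, 0.4],
]
gt = [{0}, {1}, {2}]
# Text-to-image similarities are the transpose of the image-to-text matrix.
sim_t2i = [list(col) for col in zip(*sim_i2t)]

r1 = recall_at_k(sim_i2t, gt, 1)
mr = mean_recall(sim_i2t, gt, sim_t2i, gt, ks=(1, 2, 3))
```

On real benchmarks each image usually has several captions, so `gt[i]` would hold more than one index for image-to-text queries.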
Pages: 130-145
Page count: 16