Visual Semantic Reasoning for Image-Text Matching

Cited by: 350
Authors
Li, Kunpeng [1]
Zhang, Yulun [1]
Li, Kai [1]
Li, Yuanyuan [1]
Fu, Yun [1, 2]
Affiliations
[1] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
[2] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Funding
National Science Foundation (USA)
DOI
10.1109/ICCV.2019.00475
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image-text matching is an active research topic that bridges vision and language. It remains challenging because current image representations usually lack the global semantic concepts found in the corresponding text captions. To address this issue, we propose a simple and interpretable reasoning model that generates a visual representation capturing the key objects and semantic concepts of a scene. Specifically, we first build connections between image regions and perform reasoning with Graph Convolutional Networks to generate features with semantic relationships. We then use a gate and memory mechanism to perform global semantic reasoning on these relationship-enhanced features, select the discriminative information, and gradually generate a representation for the whole scene. Experiments validate that our method achieves a new state of the art for image-text matching on the MS-COCO [28] and Flickr30K [40] datasets. It outperforms the current best method by a relative 6.8% for image retrieval and a relative 4.8% for caption retrieval on MS-COCO (Recall@1 on the 1K test set). On Flickr30K, our model improves image retrieval by a relative 12.6% and caption retrieval by a relative 5.8% (Recall@1).
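The two reasoning stages named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not specify the graph construction or the gating equations, so the sketch assumes a fully connected region graph with softmax-normalized dot-product edges, a residual connection, and a GRU-style update. All function names, weight shapes, and the random toy inputs are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rand_matrix(rows, cols, rng, scale=0.1):
    """Toy stand-in for learned weights or extracted region features."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def region_reasoning(regions, w_g):
    """One graph-convolution step over a fully connected region graph."""
    # Assumed edge weights: row-wise softmax of pairwise dot products.
    transposed = [list(col) for col in zip(*regions)]
    affinity = [softmax(row) for row in matmul(regions, transposed)]
    # Aggregate neighbor features, then apply the (hypothetical) weight w_g.
    messages = matmul(matmul(affinity, regions), w_g)
    # Assumed residual connection: keep the original region features.
    return [[r + m for r, m in zip(reg, msg)] for reg, msg in zip(regions, messages)]

def global_reasoning(regions, params):
    """Feed regions one by one through a GRU-style gated update."""
    w_z, w_r, w_h = params  # each has shape (2 * dim) x dim
    hidden = [0.0] * len(regions[0])  # memory of the scene so far
    for region in regions:
        joined = [region + hidden]  # 1 x (2 * dim) row vector
        update = [sigmoid(v) for v in matmul(joined, w_z)[0]]
        reset = [sigmoid(v) for v in matmul(joined, w_r)[0]]
        gated = [region + [r * h for r, h in zip(reset, hidden)]]
        candidate = [math.tanh(v) for v in matmul(gated, w_h)[0]]
        # The update gate decides how much new information enters the memory.
        hidden = [(1 - z) * h + z * c for z, h, c in zip(update, hidden, candidate)]
    return hidden  # final memory state serves as the whole-image representation

rng = random.Random(0)
num_regions, dim = 4, 3
regions = rand_matrix(num_regions, dim, rng, scale=1.0)
w_g = rand_matrix(dim, dim, rng)
gru_params = tuple(rand_matrix(2 * dim, dim, rng) for _ in range(3))

enhanced = region_reasoning(regions, w_g)
image_embedding = global_reasoning(enhanced, gru_params)
print(len(image_embedding))
```

In this sketch the final hidden state has the same dimensionality as a single region feature, so it could be compared against a caption embedding of matching size. How the two are actually matched is outside what the abstract states.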
Pages: 4653-4661
Page count: 9
Related papers
50 in total
  • [1] Image-Text Embedding Learning via Visual and Textual Semantic Reasoning
    Li, Kunpeng
    Zhang, Yulun
    Li, Kai
    Li, Yuanyuan
    Fu, Yun
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01) : 641 - 656
  • [2] VSR++: Improving Visual Semantic Reasoning for Fine-Grained Image-Text Matching
    Yuan, Hui
    Huang, Yan
    Zhang, Dongbo
    Chen, Zerui
    Cheng, Wenlong
    Wang, Liang
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 3728 - 3735
  • [3] Regularizing Visual Semantic Embedding With Contrastive Learning for Image-Text Matching
    Liu, Yang
    Liu, Hong
    Wang, Huaqiu
    Liu, Mengyuan
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 1332 - 1336
  • [4] Similarity Reasoning and Filtration for Image-Text Matching
    Diao, Haiwen
    Zhang, Ying
    Ma, Lin
    Lu, Huchuan
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 1218 - 1226
  • [5] Asymmetric Polysemous Reasoning for Image-Text Matching
    Zhang, Hongping
    Yang, Ming
    [J]. 2023 23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW 2023, 2023, : 1013 - 1022
  • [6] IMAGE-TEXT MATCHING WITH SHARED SEMANTIC CONCEPTS
    Miao Lanxin
    [J]. 2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022,
  • [7] Transformer Reasoning Network for Image-Text Matching and Retrieval
    Messina, Nicola
    Falchi, Fabrizio
    Esuli, Andrea
    Amato, Giuseppe
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 5222 - 5229
  • [8] Visual Contextual Semantic Reasoning for Cross-Modal Drone Image-Text Retrieval
    Huang, Jinghao
    Chen, Yaxiong
    Xiong, Shengwu
    Lu, Xiaoqiang
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [9] Learning Semantic Relationship among Instances for Image-Text Matching
    Fu, Zheren
    Mao, Zhendong
    Song, Yan
    Zhang, Yongdong
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15159 - 15168
  • [10] Dual Semantic Relationship Attention Network for Image-Text Matching
    Wen, Keyu
    Gu, Xiaodong
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,