Dual Semantic Relationship Attention Network for Image-Text Matching

Cited by: 0
Authors
Wen, Keyu [1]
Gu, Xiaodong [1]
Affiliations
[1] Fudan Univ, Dept Elect Engn, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
cross-modal; retrieval; attention; semantic relationship;
DOI
10.1109/ijcnn48605.2020.9206782
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image-text matching is a major task in cross-modal information processing. The main challenge is to learn unified vision and language representations. Previous methods that perform well on this task focus primarily on matching region features in images to the words in sentences. However, this causes the region features to lose contact with the global context, leading to mismatches with the non-object words in some sentences. To alleviate this problem, this work proposes a novel Dual Semantic Relationship Attention Network, which consists mainly of two modules: the separate semantic relationship module and the joint semantic relationship module. With these two modules, different hierarchies of semantic relationships are learned simultaneously, which improves image-text matching. Quantitative experiments on MS-COCO and Flickr-30K show that our method outperforms previous approaches by a large margin, owing to the effectiveness of the dual semantic relationship attention scheme.
Pages: 7