Scene graph fusion and negative sample generation strategy for image-text matching

Authors
Wang, Liqin [1, 2, 3]
Yang, Pengcheng [1]
Wang, Xu [1, 2, 3]
Xu, Zhihong [1, 2, 3]
Dong, Yongfeng [1, 2, 3]
Affiliations
[1] School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin, 300401, China
[2] Hebei Province Key Laboratory of Big Data Calculation, Tianjin, 300401, China
[3] Hebei Data Driven Industrial Intelligent Engineering Research Center, Tianjin, 300401, China
Source
Journal of Supercomputing | 2025 / Vol. 81 / No. 01
Keywords
Semantics
DOI
10.1007/s11227-024-06652-2
Abstract
In image-text matching, scene graph-based approaches are commonly used to detect semantic associations between entities across modalities, improving cross-modal interaction by capturing finer-grained associations. However, the associations between images and texts are often modeled only implicitly, which leaves a semantic gap between image and text information. To address the lack of cross-modal information integration and to explicitly model fine-grained semantic information in images and texts, we propose a scene graph fusion and negative sample generation strategy for image-text matching (SGFNS). Furthermore, to enhance the representation of the less salient features of similar images in image-text matching, we propose a negative sample generation strategy and introduce an additional loss function that incorporates the negative samples into training. In experiments, we verify the effectiveness of our model against current state-of-the-art models that use scene graphs directly. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
Related papers
50 in total
  • [1] Scene Graph based Fusion Network for Image-Text Retrieval
    Wang, Guoliang
    Shang, Yanlei
    Chen, Yong
    Zhen, Chaoqi
    Cheng, Dequan
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 138 - 143
  • [2] Global-local fusion based on adversarial sample generation for image-text matching
    Huang, Shichen
    Fu, Weina
    Zhang, Zhaoyue
    Liu, Shuai
    INFORMATION FUSION, 2024, 103
  • [3] A Deep Local and Global Scene-Graph Matching for Image-Text Retrieval
    Manh-Duy Nguyen
    Binh T Nguyen
    Cathal Gurrin
    NEW TRENDS IN INTELLIGENT SOFTWARE METHODOLOGIES, TOOLS AND TECHNIQUES, 2021, 337 : 510 - 523
  • [4] Fusion layer attention for image-text matching
    Wang, Depeng
    Wang, Liejun
    Song, Shiji
    Huang, Gao
    Guo, Yuchen
    Cheng, Shuli
    Ao, Naixiang
    Du, Anyu
    NEUROCOMPUTING, 2021, 442 : 249 - 259
  • [5] News Image-Text Matching With News Knowledge Graph
    Zhao Yumeng
    Yun Jing
    Gao Shuo
    Liu Limin
    IEEE ACCESS, 2021, 9 : 108017 - 108027
  • [6] Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval
    Wang, Sijin
    Wang, Ruiping
    Yao, Ziwei
    Shan, Shiguang
    Chen, Xilin
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 1497 - 1506
  • [7] Generating counterfactual negative samples for image-text matching
    Su, Xinqi
    Song, Dan
    Li, Wenhui
    Ren, Tongwei
    Liu, An-An
    Information Processing and Management, 2025, 62 (03):
  • [8] Adaptive Latent Graph Representation Learning for Image-Text Matching
    Tian, Mengxiao
    Wu, Xinxiao
    Jia, Yunde
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 471 - 482
  • [9] Cross Attention Graph Matching Network for Image-Text Retrieval
    Yang, Xiaoyu
    Xie, Hao
    Mao, Junyi
    Wang, Zhiguo
    Yin, Guangqiang
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND NETWORKS, VOL II, CENET 2023, 2024, 1126 : 274 - 286
  • [10] Scene Graph Semantic Inference for Image and Text Matching
    Pei, Jiaming
    Zhong, Kaiyang
    Yu, Zhi
    Wang, Lukun
    Lakshmanna, Kuruva
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (05)