A Multiview Text Imagination Network Based on Latent Alignment for Image-Text Matching

Cited by: 2
Authors
Shang, Heng [1]
Zhao, Guoshuai [1]
Shi, Jing [1]
Qian, Xueming [2]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, SMILES Lab, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Feature extraction; Semantics; Text mining; Intelligent systems; Image representation; Task analysis; Image edge detection;
DOI
10.1109/MIS.2023.3265176
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In image-text matching, one key to improving performance is extracting features that carry more semantic information. Existing works demonstrate that semantic enrichment through knowledge expansion can improve performance. Most of them expand image features; however, the shortage of semantic information in the text modality and the reliance on a single view are often bottlenecks that limit the performance of image-text matching models. To solve these two problems, we aggregate knowledge from multiple views and propose a word imagination graph (WIG). A WIG can be used to expand textual semantic information by imagination based on input images. Then, utilizing the WIG, we construct a novel multiview text imagination network (MTIN). An MTIN enables latent alignment of images and texts on tags, which can assist matching on a semantic level. Results on the Flickr30K and MS-COCO datasets demonstrate the effectiveness of our method. The source code has been released on GitHub: https://github.com/smileslabsh/Multiview-Text-Imagination-Network.
Pages: 41-50
Number of pages: 10
Related papers
50 in total
  • [41] Stacked Cross Attention for Image-Text Matching
    Lee, Kuang-Huei
    Chen, Xi
    Hua, Gang
    Hu, Houdong
    He, Xiaodong
    COMPUTER VISION - ECCV 2018, PT IV, 2018, 11208 : 212 - 228
  • [42] IMAGE-TEXT MATCHING WITH SHARED SEMANTIC CONCEPTS
    Miao Lanxin
    2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022,
  • [43] Enhancing Separate Encoding with Multi-layer Feature Alignment for Image-Text Matching
    Wen, Keyu
    Li, Linyang
    Gu, Xiaodong
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT I, 2021, 12891 : 403 - 414
  • [44] HIERARCHICAL ATTENTION IMAGE-TEXT ALIGNMENT NETWORK FOR PERSON RE-IDENTIFICATION
    Kansal, Kajal
    Subramanyam, A., V
    Wang, Zheng
    Satoh, Shinichi
    2021 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2021,
  • [45] Blog Article Summarization with Image-Text Alignment Techniques
    Chu, Wei-Ta
    Kao, Ming-Chih
    2017 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2017, : 244 - 247
  • [46] Generating counterfactual negative samples for image-text matching
    Su, Xinqi
    Song, Dan
    Li, Wenhui
    Ren, Tongwei
    Liu, An-An
    Information Processing and Management, 2025, 62 (03):
  • [47] Towards Deconfounded Image-Text Matching with Causal Inference
    Li, Wenhui
    Su, Xinqi
    Song, Dan
    Wang, Lanjun
    Zhang, Kun
    Liu, An-An
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 6264 - 6273
  • [48] A method for image-text matching based on semantic filtering and adaptive adjustment
    Jin, Ran
    Hou, Tengda
    Jin, Tao
    Yuan, Jie
    Du, Chenjie
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2024, 2024 (01)
  • [49] Unifying knowledge iterative dissemination and relational reconstruction network for image-text matching
    Xie, Xiumin
    Li, Zhixin
    Tang, Zhenjun
    Yao, Dan
    Ma, Huifang
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (01)
  • [50] Multi-Modal Memory Enhancement Attention Network for Image-Text Matching
    Ji, Zhong
    Lin, Zhigang
    Wang, Haoran
    He, Yuqing
    IEEE ACCESS, 2020, 8 : 38438 - 38447