An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information

Cited by: 0
Authors
Li, Zejun [1]
Wei, Zhongyu [1, 4]
Fan, Zhihao [1]
Shan, Haijun [2]
Huang, Xuanjing [3]
Affiliations
[1] Fudan Univ, Sch Data Sci, Shanghai, Peoples R China
[2] Zhejiang Lab, Hangzhou, Peoples R China
[3] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
[4] Fudan Univ, Res Inst Intelligent & Complex Syst, Shanghai, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we focus on unsupervised image-sentence matching. Existing research uses document-level structural information to sample positive and negative instances for model training. Although this approach achieves positive results, it introduces a sampling bias and fails to distinguish instances with high semantic similarity. To alleviate the bias, we propose a new sampling strategy that selects additional intra-document image-sentence pairs as positive or negative samples. Furthermore, to recognize the complex patterns in intra-document samples, we propose a Transformer-based model that captures fine-grained features and implicitly constructs a graph for each document, in which concepts in the document are introduced to bridge the representation learning of images and sentences in the document's context. Experimental results show that our approach effectively alleviates the bias and learns well-aligned multimodal representations.
Pages: 13324-13332
Page count: 9