Semantic Modeling of Textual Relationships in Cross-modal Retrieval

Cited: 3
Authors
Yu, Jing [1 ]
Yang, Chenghao [2 ]
Qin, Zengchang [2 ]
Yang, Zhuoqian [2 ]
Hu, Yue [1 ]
Shi, Zhiguo [3 ]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Beihang Univ, Intelligent Comp & Machine Learning Lab, Beijing, Peoples R China
[3] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing, Peoples R China
Keywords
Textual relationships; Relationship integration; Cross-modal retrieval; Knowledge graph; Graph Convolutional Network;
DOI
10.1007/978-3-030-29551-6_3
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature modeling of different modalities is a fundamental problem in current research on cross-modal information retrieval. Existing models typically project texts and images into a common embedding space, in which semantically similar information lies at a shorter distance. Semantic modeling of textual relationships is notoriously difficult. In this paper, we propose an approach that models texts as a featured graph by integrating multi-view textual relationships, including semantic relationships, statistical co-occurrence, and prior relationships from a knowledge base. A dual-path neural network is adopted to jointly learn multi-modal representations and a cross-modal similarity measure. We use a Graph Convolutional Network (GCN) to generate relation-aware text representations, and a Convolutional Neural Network (CNN) with non-linearities for image representations. The cross-modal similarity measure is learned by distance metric learning. Experimental results show that, by leveraging the rich relational semantics in texts, our model outperforms the state-of-the-art models by 3.4% and 6.3% in accuracy on two benchmark datasets.
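The text-side graph convolution mentioned in the abstract follows the standard GCN propagation rule H' = ReLU(D^(-1/2)(A+I)D^(-1/2) H W). The sketch below is a generic, minimal illustration of that rule on a toy word graph, not the authors' released code; the adjacency matrix, feature sizes, and function name are all hypothetical.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt       # symmetric normalization
    return np.maximum(0.0, A_norm @ H @ W)         # ReLU non-linearity

# Toy featured graph: 4 word nodes, 3-dim input features, 2-dim output features.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
H = np.random.randn(4, 3)   # node features (e.g., word embeddings)
W = np.random.randn(3, 2)   # learnable layer weights
out = gcn_layer(A, H, W)
print(out.shape)  # (4, 2)
```

Stacking such layers lets each word's representation absorb information from its relational neighbors, which is what makes the resulting text representation "relation-aware".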
Pages: 24-32
Page count: 9
Related Papers
50 records in total
  • [1] Automatic semantic modeling of structured data sources with cross-modal retrieval
    Xu, Ruiqing
    Mayer, Wolfgang
    Chu, Hailong
    Zhang, Yitao
    Zhang, Hong-Yu
    Wang, Yulong
    Liu, Youfa
    Feng, Zaiwen
    PATTERN RECOGNITION LETTERS, 2024, 177 : 7 - 14
  • [2] Deep Semantic Mapping for Cross-Modal Retrieval
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 234 - 241
  • [3] Semantic consistency hashing for cross-modal retrieval
    Yao, Tao
    Kong, Xiangwei
    Fu, Haiyan
    Tian, Qi
    NEUROCOMPUTING, 2016, 193 : 250 - 259
  • [4] Analyzing semantic correlation for cross-modal retrieval
    Liang Xie
    Peng Pan
    Yansheng Lu
    Multimedia Systems, 2015, 21 : 525 - 539
  • [5] Analyzing semantic correlation for cross-modal retrieval
    Xie, Liang
    Pan, Peng
    Lu, Yansheng
    MULTIMEDIA SYSTEMS, 2015, 21 (06) : 525 - 539
  • [6] Multi-modal semantic autoencoder for cross-modal retrieval
    Wu, Yiling
    Wang, Shuhui
    Huang, Qingming
    NEUROCOMPUTING, 2019, 331 : 165 - 175
  • [7] Generalized Semantic Preserving Hashing for Cross-Modal Retrieval
    Mandal, Devraj
    Chaudhury, Kunal N.
    Biswas, Soma
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (01) : 102 - 112
  • [8] A Scalable Architecture for Cross-Modal Semantic Annotation and Retrieval
    Moeller, Manuel
    Sintek, Michael
    KI 2008: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2008, 5243 : 391 - 392
  • [9] Semantic-Guided Hashing for Cross-Modal Retrieval
    Chen, Zhikui
    Du, Jianing
    Zhong, Fangming
    Chen, Shi
    2019 IEEE FIFTH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (IEEE BIGDATASERVICE 2019), 2019, : 182 - 190
  • [10] Semantic ranking structure preserving for cross-modal retrieval
    Liu, Hui
    Feng, Yong
    Zhou, Mingliang
    Qiang, Baohua
    APPLIED INTELLIGENCE, 2021, 51 (03) : 1802 - 1812