Semantic Modeling of Textual Relationships in Cross-modal Retrieval

Cited: 3
Authors
Yu, Jing [1 ]
Yang, Chenghao [2 ]
Qin, Zengchang [2 ]
Yang, Zhuoqian [2 ]
Hu, Yue [1 ]
Shi, Zhiguo [3 ]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Beihang Univ, Intelligent Comp & Machine Learning Lab, Beijing, Peoples R China
[3] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing, Peoples R China
Keywords
Textual relationships; Relationship integration; Cross-modal retrieval; Knowledge graph; Graph Convolutional Network;
DOI
10.1007/978-3-030-29551-6_3
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature modeling of different modalities is a fundamental problem in current research on cross-modal information retrieval. Existing models typically project texts and images into a common embedding space, in which semantically similar information lies at a shorter distance. Semantic modeling of textual relationships is notoriously difficult. In this paper, we propose an approach that models texts as a featured graph by integrating multi-view textual relationships, including semantic relationships, statistical co-occurrence, and prior relationships from a knowledge base. A dual-path neural network is adopted to jointly learn multi-modal representations and a cross-modal similarity measure. We use a Graph Convolutional Network (GCN) to generate relation-aware text representations, and a Convolutional Neural Network (CNN) with non-linearities for image representations. The cross-modal similarity measure is learned by distance metric learning. Experimental results show that, by leveraging the rich relational semantics in texts, our model outperforms the state-of-the-art models by 3.4% and 6.3% in accuracy on two benchmark datasets.
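The text-side graph convolution mentioned in the abstract follows the standard GCN propagation rule H' = ReLU(D^(-1/2)(A+I)D^(-1/2) H W). The sketch below is a generic, minimal illustration of that rule on a toy word graph, not the authors' released code; the adjacency matrix, feature sizes, and function name are all hypothetical.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt       # symmetric normalization
    return np.maximum(0.0, A_norm @ H @ W)         # ReLU non-linearity

# Toy featured graph: 4 word nodes, 3-dim input features, 2-dim output features.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
H = np.random.randn(4, 3)   # node features (e.g., word embeddings)
W = np.random.randn(3, 2)   # learnable layer weights
out = gcn_layer(A, H, W)
print(out.shape)  # (4, 2)
```

Stacking such layers lets each word's representation absorb information from its relational neighbors, which is what makes the resulting text representation "relation-aware".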
Pages: 24-32
Page count: 9
Related Papers
50 records in total
  • [1] Automatic semantic modeling of structured data sources with cross-modal retrieval
    Xu, Ruiqing
    Mayer, Wolfgang
    Chu, Hailong
    Zhang, Yitao
    Zhang, Hong-Yu
    Wang, Yulong
    Liu, Youfa
    Feng, Zaiwen
    PATTERN RECOGNITION LETTERS, 2024, 177 : 7 - 14
  • [2] Deep Semantic Mapping for Cross-Modal Retrieval
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 234 - 241
  • [3] Semantic consistency hashing for cross-modal retrieval
    Yao, Tao
    Kong, Xiangwei
    Fu, Haiyan
    Tian, Qi
    NEUROCOMPUTING, 2016, 193 : 250 - 259
  • [4] Analyzing semantic correlation for cross-modal retrieval
    Liang Xie
    Peng Pan
    Yansheng Lu
    Multimedia Systems, 2015, 21 : 525 - 539
  • [5] Analyzing semantic correlation for cross-modal retrieval
    Xie, Liang
    Pan, Peng
    Lu, Yansheng
    MULTIMEDIA SYSTEMS, 2015, 21 (06) : 525 - 539
  • [6] Multi-modal semantic autoencoder for cross-modal retrieval
    Wu, Yiling
    Wang, Shuhui
    Huang, Qingming
    NEUROCOMPUTING, 2019, 331 : 165 - 175
  • [7] Generalized Semantic Preserving Hashing for Cross-Modal Retrieval
    Mandal, Devraj
    Chaudhury, Kunal N.
    Biswas, Soma
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (01) : 102 - 112
  • [8] A Scalable Architecture for Cross-Modal Semantic Annotation and Retrieval
    Moeller, Manuel
    Sintek, Michael
    KI 2008: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2008, 5243 : 391 - 392
  • [9] Semantic-Guided Hashing for Cross-Modal Retrieval
    Chen, Zhikui
    Du, Jianing
    Zhong, Fangming
    Chen, Shi
    2019 IEEE FIFTH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (IEEE BIGDATASERVICE 2019), 2019, : 182 - 190
  • [10] Semantic ranking structure preserving for cross-modal retrieval
    Liu, Hui
    Feng, Yong
    Zhou, Mingliang
    Qiang, Baohua
    APPLIED INTELLIGENCE, 2021, 51 (03) : 1802 - 1812