Semantic Modeling of Textual Relationships in Cross-modal Retrieval

Cited: 3
Authors
Yu, Jing [1 ]
Yang, Chenghao [2 ]
Qin, Zengchang [2 ]
Yang, Zhuoqian [2 ]
Hu, Yue [1 ]
Shi, Zhiguo [3 ]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Beihang Univ, Intelligent Comp & Machine Learning Lab, Beijing, Peoples R China
[3] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing, Peoples R China
Keywords
Textual relationships; Relationship integration; Cross-modal retrieval; Knowledge graph; Graph Convolutional Network;
DOI
10.1007/978-3-030-29551-6_3
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature modeling of different modalities is a fundamental problem in cross-modal information retrieval research. Existing models typically project texts and images into a common embedding space, in which semantically similar information lies at a shorter distance. Semantic modeling of textual relationships, however, is notoriously difficult. In this paper, we propose an approach that models texts as a featured graph by integrating multi-view textual relationships, including semantic relationships, statistical co-occurrence, and prior relationships from a knowledge base. A dual-path neural network is adopted to jointly learn multi-modal representations and a cross-modal similarity measure. We use a Graph Convolutional Network (GCN) to generate relation-aware text representations and a Convolutional Neural Network (CNN) with non-linearities for image representations. The cross-modal similarity measure is learned by distance metric learning. Experimental results show that, by leveraging the rich relational semantics in texts, our model outperforms the state-of-the-art models by 3.4% and 6.3% in accuracy on two benchmark datasets.
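The relation-aware text branch described in the abstract rests on graph convolution over a featured text graph. As a minimal sketch (not the authors' implementation; the symmetric-normalization formulation, toy adjacency, and all names here are illustrative assumptions), one GCN propagation step over word nodes can be written in NumPy:

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: H' = ReLU(D^{-1/2} (A+I) D^{-1/2} H W)."""
    a_hat = adj + np.eye(adj.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))      # inverse sqrt of node degrees
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)    # aggregate neighbors, then ReLU

# Toy featured graph: 4 word nodes whose edges fuse multi-view textual
# relationships (semantic, co-occurrence, knowledge-base) into one adjacency.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
feats = rng.normal(size=(4, 8))    # per-node word features (e.g. word embeddings)
w = rng.normal(size=(8, 5))        # learnable layer weight
out = gcn_layer(adj, feats, w)
print(out.shape)                   # (4, 5): relation-aware node representations
```

In the full dual-path model, the outputs of stacked layers like this would be pooled into a text embedding and compared against the CNN image embedding under the learned distance metric.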
Pages: 24-32
Page count: 9
Related Papers
50 records in total
  • [21] Adaptive Marginalized Semantic Hashing for Unpaired Cross-Modal Retrieval
    Luo, Kaiyi
    Zhang, Chao
    Li, Huaxiong
    Jia, Xiuyi
    Chen, Chunlin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 9082 - 9095
  • [22] Deep semantic similarity adversarial hashing for cross-modal retrieval
    Qiang, Haopeng
    Wan, Yuan
    Xiang, Lun
    Meng, Xiaojing
    NEUROCOMPUTING, 2020, 400 : 24 - 33
  • [24] Cross-Modal Image-Text Retrieval with Semantic Consistency
    Chen, Hui
    Ding, Guiguang
    Lin, Zijin
    Zhao, Sicheng
    Han, Jungong
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 1749 - 1757
  • [25] Semantic preserving asymmetric discrete hashing for cross-modal retrieval
    Yang, Fan
    Zhang, Qiao-xi
    Ding, Xiao-jian
    Ma, Fu-min
    Cao, Jie
    Tong, De-yu
    APPLIED INTELLIGENCE, 2023, 53 (12) : 15352 - 15371
  • [26] Discrete semantic embedding hashing for scalable cross-modal retrieval
    Liu, Junjie
    Fei, Lunke
    Jia, Wei
    Zhao, Shuping
    Wen, Jie
    Teng, Shaohua
    Zhang, Wei
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 1461 - 1467
  • [27] Semantic decomposition and enhancement hashing for deep cross-modal retrieval
    Fei, Lunke
    He, Zhihao
    Wong, Wai Keung
    Zhu, Qi
    Zhao, Shuping
    Wen, Jie
    PATTERN RECOGNITION, 2025, 160
  • [28] Deep supervised multimodal semantic autoencoder for cross-modal retrieval
    Tian, Yu
    Yang, Wenjing
    Liu, Qingsong
    Yang, Qiong
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2020, 31 (4-5)
  • [29] An efficient dual semantic preserving hashing for cross-modal retrieval
    Liu, Yun
    Ji, Shujuan
    Fu, Qiang
    Chiu, Dickson K. W.
    Gong, Maoguo
    NEUROCOMPUTING, 2022, 492 : 264 - 277
  • [30] ONION: Online Semantic Autoencoder Hashing for Cross-Modal Retrieval
    Zhang, Donglin
    Wu, Xiao-Jun
    Chen, Guoqing
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, 14 (02)