Semantic Modeling of Textual Relationships in Cross-modal Retrieval

Cited by: 3

Authors:

Yu, Jing [1]

Yang, Chenghao [2]

Qin, Zengchang [2]

Yang, Zhuoqian [2]

Hu, Yue [1]

Shi, Zhiguo [3]

Affiliations:

[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China

[2] Beihang Univ, Intelligent Comp & Machine Learning Lab, Beijing, Peoples R China

[3] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing, Peoples R China

Source:

KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT I | 2019 / Vol. 11775

Keywords:

Textual relationships; Relationship integration; Cross-modal retrieval; Knowledge graph; Graph Convolutional Network;

DOI:

10.1007/978-3-030-29551-6_3

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Feature modeling of different modalities is a basic problem in current research on cross-modal information retrieval. Existing models typically project texts and images into one embedding space, in which semantically similar information will have a shorter distance. Semantic modeling of textual relationships is notoriously difficult. In this paper, we propose an approach to model texts using a featured graph by integrating multi-view textual relationships including semantic relationships, statistical co-occurrence, and prior relationships in a knowledge base. A dual-path neural network is adopted to jointly learn multi-modal representations of information and a cross-modal similarity measure. We use a Graph Convolutional Network (GCN) for generating relation-aware text representations, and use a Convolutional Neural Network (CNN) with non-linearities for image representations. The cross-modal similarity measure is learned by distance metric learning. Experimental results show that, by leveraging the rich relational semantics in texts, our model can outperform the state-of-the-art models by 3.4% and 6.3% in accuracy on two benchmark datasets.
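The text path described in the abstract can be illustrated with a generic graph convolution step. The following is a minimal sketch under stated assumptions, not the authors' implementation: the symmetric adjacency normalization, the mean pooling over nodes, the cosine score, the toy graph, and all function names are hypothetical choices made for illustration. In particular, the cosine score only stands in for the learned distance metric, and the random image vector stands in for a CNN feature.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, a normalization commonly used in GCNs."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)                  # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)

def cosine_similarity(u, v):
    """Cosine score; an assumed stand-in for a learned similarity measure."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

rng = np.random.default_rng(0)

# Hypothetical toy text graph: 4 word nodes; edges stand in for textual relationships.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = rng.normal(size=(4, 8))   # assumed per-word input features
weight = rng.normal(size=(8, 5))     # untrained layer weights, for shape illustration only

adj_norm = normalize_adjacency(adj)
node_repr = gcn_layer(adj_norm, features, weight)

text_vec = node_repr.mean(axis=0)    # assumed pooling into one text vector
image_vec = rng.normal(size=5)       # placeholder for a CNN image representation
score = cosine_similarity(text_vec, image_vec)
```

In a trained dual-path model the weights of both paths and the similarity function would be learned jointly; here they are random, so the score carries no meaning beyond showing how the pieces connect.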

Pages: 24-32

Page count: 9

