Modeling Text with Graph Convolutional Network for Cross-Modal Information Retrieval

Cited by: 29
Authors:
Yu, Jing [1,2]
Lu, Yuhang [1,2]
Qin, Zengchang [3]
Zhang, Weifeng [4,5]
Liu, Yanbing [1]
Tan, Jianlong [1]
Guo, Li [1]
Affiliations:
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[3] Beihang Univ, Sch ASEE, Intelligent Comp & Machine Learning Lab, Beijing, Peoples R China
[4] Hangzhou Dianzi Univ, Hangzhou, Zhejiang, Peoples R China
[5] Zhejiang Future Technol Inst, Jiaxing, Peoples R China
Keywords:
DOI: 10.1007/978-3-030-00776-8_21
CLC Classification: TP [Automation Technology, Computer Technology]
Subject Classification: 0812
Abstract:
Cross-modal information retrieval aims to find heterogeneous data of various modalities given a query of one modality. The main challenge is to map different modalities into a common semantic space in which the distances between concepts from different modalities can be well modeled. For cross-modal retrieval between images and texts, existing work mostly uses off-the-shelf Convolutional Neural Networks (CNNs) for image feature extraction, while texts are represented by deep models built on word-level features such as bag-of-words or word2vec. Besides word-level semantics, the semantic relations between words are also informative but less explored. In this paper, we model texts as graphs using a similarity measure based on word2vec. A dual-path neural network is proposed for coupled feature learning in cross-modal information retrieval: one path applies a Graph Convolutional Network (GCN) to the graph representation of the text, while the other path applies a neural network with several nonlinear layers to off-the-shelf image features. The model is trained with a pairwise similarity loss that maximizes the similarity of relevant text-image pairs and minimizes the similarity of irrelevant pairs. Experimental results show that the proposed model significantly outperforms state-of-the-art methods, with a 17% improvement in accuracy in the best case.
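To make the described architecture concrete, the following is a minimal PyTorch sketch of a dual-path model of this kind: a two-layer GCN over a word-similarity graph on the text path, fully connected layers over precomputed CNN image features on the image path, and a pairwise cosine-similarity loss. The layer sizes, the identity adjacency used in the demo, and the hinge form of the loss are illustrative assumptions, not details taken from the paper.

# Minimal sketch of a dual-path text-graph / image model (assumed configuration).
import torch
import torch.nn as nn
import torch.nn.functional as F


class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, a_hat):
        return F.relu(a_hat @ self.linear(h))


class DualPathModel(nn.Module):
    def __init__(self, word_dim=300, img_dim=4096, common_dim=1024):
        super().__init__()
        # Text path: two GCN layers over the word-similarity graph,
        # pooled into a single text embedding.
        self.gcn1 = GCNLayer(word_dim, 512)
        self.gcn2 = GCNLayer(512, common_dim)
        # Image path: fully connected layers over off-the-shelf CNN features.
        self.img_fc = nn.Sequential(
            nn.Linear(img_dim, 2048), nn.ReLU(),
            nn.Linear(2048, common_dim),
        )

    def forward(self, word_feats, a_hat, img_feats):
        h = self.gcn2(self.gcn1(word_feats, a_hat), a_hat)
        text_vec = h.mean(dim=0)      # pool node embeddings into a text vector
        img_vec = self.img_fc(img_feats)
        return text_vec, img_vec


def pairwise_similarity_loss(text_vec, img_vec, relevant, margin=0.2):
    """Push cosine similarity up for relevant pairs and below a margin for
    irrelevant ones; a common hinge-style instantiation of the pairwise loss
    described in the abstract (the exact form used in the paper may differ)."""
    sim = F.cosine_similarity(text_vec, img_vec, dim=-1)
    return (1 - sim) if relevant else F.relu(sim - margin)


if __name__ == "__main__":
    n_words = 20
    words = torch.randn(n_words, 300)   # stand-in for word2vec node features
    adj = torch.eye(n_words)            # stand-in for the normalized adjacency
    img = torch.randn(4096)             # stand-in for an off-the-shelf CNN feature
    model = DualPathModel()
    t, v = model(words, adj, img)
    print(pairwise_similarity_loss(t, v, relevant=True).item())

In practice the adjacency would be built from word2vec similarities between the words of each text and normalized before being passed to the GCN layers; the identity matrix above is only a placeholder to keep the example runnable.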
Pages: 223-234 (12 pages)
Related papers (showing 10 of 50):
[1] Xu, Ruiqing; Li, Chao; Yan, Junchi; Deng, Cheng; Liu, Xianglong. Graph Convolutional Network Hashing for Cross-Modal Retrieval. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019: 982-988.
[2] Dong, Xinfeng; Liu, Li; Zhu, Lei; Nie, Liqiang; Zhang, Huaxiang. Adversarial Graph Convolutional Network for Cross-Modal Retrieval. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(3): 1634-1645.
[3] Bai, Cong; Zeng, Chao; Ma, Qing; Zhang, Jinglin. Graph Convolutional Network Discrete Hashing for Cross-Modal Retrieval. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(4): 4756-4767.
[4] Cheng, Yuhao; Zhu, Xiaoguang; Qian, Jiuchao; Wen, Fei; Liu, Peilin. Cross-modal Graph Matching Network for Image-text Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications, 2022, 18(4).
[5] Yang, Xianben; Zhang, Wei. RETRACTED: Graph Convolutional Networks for Cross-Modal Information Retrieval (Retracted Article). Wireless Communications & Mobile Computing, 2022, 2022.
[6] Wei, Yuqi; Li, Ning. Cross-Modal Information Interaction Reasoning Network for Image and Text Retrieval. Computer Engineering and Applications, 2023, 59(16): 115-124.
[7] Zhang, Lei; Chen, Leiting; Ou, Weihua; Zhou, Chuan. Semi-supervised constrained graph convolutional network for cross-modal retrieval. Computers & Electrical Engineering, 2022, 101.
[8] Qin, Xueyang; Li, Lishuang; Pang, Guangyao; Hao, Fei. Heterogeneous Graph Fusion Network for cross-modal image-text retrieval. Expert Systems with Applications, 2024, 249.
[9] Zhang, Youcai; Gu, Xiaodong. Graph Embedding Learning for Cross-Modal Information Retrieval. Neural Information Processing (ICONIP 2017), Part III, 2017, 10636: 594-601.
[10] Meng, Hui; Zhang, Huaxiang; Liu, Li; Liu, Dongmei; Lu, Xu; Guo, Xinru. Joint-Modal Graph Convolutional Hashing for unsupervised cross-modal retrieval. Neurocomputing, 2024, 595.