Adversarial Graph Attention Network for Multi-modal Cross-modal Retrieval

Cited by: 2
Authors
Wu, Hongchang [1 ]
Guan, Ziyu [2 ]
Zhi, Tao [3 ]
zhao, Wei [1 ]
Xu, Cai [2 ]
Han, Hong [2 ]
Yang, Yarning [2 ]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Xidian Univ, Xian, Peoples R China
[3] Xidian Univ, Sch Artificial Intelligence, Xian, Peoples R China
Keywords
Cross-modal retrieval; graph attention; self-attention; generative adversarial network
DOI
10.1109/ICBK.2019.00043
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Existing cross-modal retrieval methods are mainly constrained to the bimodal case. When applied to the multi-modal case, we need to train O(K^2) (K: number of modalities) separate models, which is inefficient and unable to exploit common information among multiple modalities. Though some studies focused on learning a common space of multiple modalities for retrieval, they assumed data to be i.i.d. and failed to learn the underlying semantic structure which could be important for retrieval. To tackle this issue, we propose an extensive Adversarial Graph Attention Network for Multi-modal Cross-modal Retrieval (AGAT). AGAT synthesizes a self-attention network (SAT), a graph attention network (GAT) and a multi-modal generative adversarial network (MGAN). The SAT generates high-level embeddings for data items from different modalities, with self-attention capturing feature-level correlations in each modality. The GAT then uses attention to aggregate embeddings of matched items from different modalities to build a common embedding space. The MGAN aims to "cluster" matched embeddings of different modalities in the common space by forcing them to be similar to the aggregation. Finally, we train the common space so that it captures the semantic structure by constraining within-class/between-class distances. Experiments on three datasets show the effectiveness of AGAT.
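The sketch below is a minimal, hypothetical PyTorch rendering of the pipeline the abstract describes: a per-modality self-attention encoder (the SAT), an attention-weighted aggregation of matched items' embeddings into a common space (standing in for the GAT), a discriminator that the per-modality embeddings must fool (the MGAN idea of pulling them toward the aggregation), and a within-class/between-class distance constraint. All module names, dimensions, losses, and the use of nn.MultiheadAttention and a simple linear attention score are assumptions for illustration; the paper's actual architecture, losses, and training procedure (including the discriminator's own update) may differ.

# Hypothetical sketch of the AGAT pipeline described in the abstract.
# Names, dimensions, and loss weights are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityEncoder(nn.Module):
    """SAT-like encoder: self-attention over feature tokens within one modality."""

    def __init__(self, in_dim, embed_dim, num_heads=4):
        super().__init__()
        self.proj = nn.Linear(in_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.out = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):                 # x: (batch, tokens, in_dim)
        h = self.proj(x)
        h, _ = self.attn(h, h, h)         # feature-level correlations in this modality
        return self.out(h.mean(dim=1))    # (batch, embed_dim) item embedding


class CrossModalAggregator(nn.Module):
    """GAT-like attention fusing matched items' embeddings into a common embedding."""

    def __init__(self, embed_dim):
        super().__init__()
        self.score = nn.Linear(embed_dim, 1)

    def forward(self, embeds):            # embeds: (batch, K modalities, embed_dim)
        w = torch.softmax(self.score(embeds), dim=1)   # attention over modalities
        return (w * embeds).sum(dim=1)                 # (batch, embed_dim)


class Discriminator(nn.Module):
    """MGAN critic: tells single-modality embeddings from the aggregation."""

    def __init__(self, embed_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(embed_dim, embed_dim), nn.ReLU(),
                                 nn.Linear(embed_dim, 1))

    def forward(self, z):
        return self.net(z)                # logit: aggregated (real) vs. per-modality (fake)


def structure_loss(common, labels, margin=1.0):
    """Pull within-class embeddings together, push between-class pairs apart."""
    d = torch.cdist(common, common)                      # pairwise distances
    same = (labels[:, None] == labels[None, :]).float()
    return (same * d + (1 - same) * F.relu(margin - d)).mean()


if __name__ == "__main__":
    # Toy example: K = 3 modalities with different raw feature sizes.
    torch.manual_seed(0)
    batch, dims, embed_dim = 8, [128, 300, 64], 32
    encoders = nn.ModuleList([ModalityEncoder(d, embed_dim) for d in dims])
    agg, disc = CrossModalAggregator(embed_dim), Discriminator(embed_dim)

    feats = [torch.randn(batch, 10, d) for d in dims]    # matched items across modalities
    labels = torch.randint(0, 4, (batch,))

    embeds = torch.stack([enc(x) for enc, x in zip(encoders, feats)], dim=1)
    common = agg(embeds)                                 # common-space embedding

    # Generator side: each modality embedding should fool the critic (look like the
    # aggregation), while the common space preserves class structure.
    adv_g = F.binary_cross_entropy_with_logits(
        disc(embeds.reshape(-1, embed_dim)),
        torch.ones(batch * len(dims), 1))
    loss = adv_g + structure_loss(common, labels)
    print(float(loss))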
Pages: 265-272
Number of pages: 8
Related Papers
50 records in total
  • [41] Cross-modal discriminant adversarial network
    Hu, Peng
    Peng, Xi
    Zhu, Hongyuan
    Lin, Jie
    Zhen, Liangli
    Wang, Wei
    Peng, Dezhong
    PATTERN RECOGNITION, 2021, 112
  • [42] Graph Convolutional Network Discrete Hashing for Cross-Modal Retrieval
    Bai, Cong
    Zeng, Chao
    Ma, Qing
    Zhang, Jinglin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 4756 - 4767
  • [43] Modality-Fused Graph Network for Cross-Modal Retrieval
    Wu, Fei
Li, Shuaishuai
    Peng, Guangchuan
    Ma, Yongheng
    Jing, Xiao-Yuan
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (05) : 1094 - 1097
  • [44] BCAN: Bidirectional Correct Attention Network for Cross-Modal Retrieval
    Liu, Yang
    Liu, Hong
    Wang, Huaqiu
    Meng, Fanyang
    Liu, Mengyuan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 14247 - 14258
  • [45] Heterogeneous Attention Network for Effective and Efficient Cross-modal Retrieval
    Yu, Tan
    Yang, Yi
    Li, Yi
    Liu, Lin
    Fei, Hongliang
    Li, Ping
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1146 - 1156
  • [46] Text-Enhanced Graph Attention Hashing for Cross-Modal Retrieval
    Zou, Qiang
    Cheng, Shuli
    Du, Anyu
    Chen, Jiayi
    ENTROPY, 2024, 26 (11)
• [47] CSAN: cross-coupled semantic adversarial network for cross-modal retrieval
    Li, Zhuoyi
    Lu, Huibin
    Fu, Hao
    Meng, Fanzhen
    Gu, Guanghua
    ARTIFICIAL INTELLIGENCE REVIEW, 2025, 58 (05)
  • [48] Deep Supervised Dual Cycle Adversarial Network for Cross-Modal Retrieval
    Liao, Lei
    Yang, Meng
    Zhang, Bob
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (02) : 920 - 934
  • [49] Multi-Level Correlation Adversarial Hashing for Cross-Modal Retrieval
    Ma, Xinhong
    Zhang, Tianzhu
    Xu, Changsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (12) : 3101 - 3114
  • [50] Cross-Modal Retrieval with Heterogeneous Graph Embedding
    Chen, Dapeng
    Wang, Min
    Chen, Haobin
    Wu, Lin
    Qin, Jing
    Peng, Wei
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 3291 - 3300