Clustering-driven Deep Adversarial Hashing for scalable unsupervised cross-modal retrieval

Cited by: 9

Authors:

Shen, Xiao [1]

Zhang, Haofeng [1]

Li, Lunbo [1]

Zhang, Zheng [2]

Chen, Debao [3]

Liu, Li [4]

Affiliations:

[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

[2] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China

[3] Huaibei Normal Univ, Sch Comp Sci & Technol, Huaibei 235000, Peoples R China

[4] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates

Source:

NEUROCOMPUTING | 2021 / Vol. 459

Funding:

National Natural Science Foundation of China;

Keywords:

Cross-modal retrieval; Hashing methods; Semantic similarity representation; Clustering algorithms;

DOI:

10.1016/j.neucom.2021.06.087

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With the advent of the big data era, multimedia data is growing rapidly, and its modalities are becoming increasingly diverse. Consequently, the demand for fast and accurate cross-modal information retrieval is increasing. Hashing-based cross-modal retrieval has attracted widespread attention because it encodes multimedia data into a common binary hash space, thereby effectively measuring the correlation between samples from different modalities. In this paper, we propose a novel end-to-end deep cross-modal retrieval framework, namely Clustering-driven Deep Adversarial Hashing (CDAH), which has three main characteristics. First, CDAH learns discriminative clusters recursively through a soft clustering model. It attempts to generate modality-invariant representations in a common space by confusing a modality classifier, which tries to distinguish different modalities according to the generated representations. Second, in order to minimize the modality gap between feature representations from different modalities with the same semantic label, and to maximize the distance between images and texts with different labels, CDAH constructs a fused-semantics matrix to integrate the original domain information from different modalities, which serves as self-supervised information to refine the binary codes. Finally, CDAH uses a scaled tanh function to adaptively learn the binary codes, so that the relaxed problem gradually converges to the original, hard binary coding problem. We conduct comprehensive experiments on four popular datasets, and the experimental results demonstrate the superiority of our model over state-of-the-art methods. © 2021 Elsevier B.V. All rights reserved.
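The scaled tanh relaxation mentioned in the abstract can be illustrated with a minimal sketch. It relies only on the general fact that tanh(beta * x) approaches sign(x) as beta grows; the function names, the sample feature values, and the beta schedule below are hypothetical and are not taken from the paper, which may use a different schedule and loss.

```python
import math

def scaled_tanh(x, beta):
    """Relaxed binary code tanh(beta * x); approaches sign(x) as beta grows."""
    return math.tanh(beta * x)

def sign(x):
    """Hard binary code in {-1, +1} (zero is mapped to +1 here by convention)."""
    return 1.0 if x >= 0 else -1.0

def quantization_gap(features, beta):
    """Largest absolute difference between relaxed and hard codes."""
    return max(abs(scaled_tanh(x, beta) - sign(x)) for x in features)

# Hypothetical real-valued network outputs (assumption, for illustration only).
features = [-1.2, -0.3, 0.05, 0.8]
# Hypothetical increasing schedule for the scale factor beta.
betas = [1.0, 5.0, 25.0, 125.0]

# The gap shrinks as beta increases, so the relaxed codes approach binary codes.
gaps = [quantization_gap(features, beta) for beta in betas]
for beta, gap in zip(betas, gaps):
    print(f"beta={beta:>6}: max gap to sign codes = {gap:.6f}")
```

Because tanh stays differentiable for any finite beta, a network can be trained with gradients on the relaxed codes while beta is increased, and the hard codes are read off with sign at retrieval time.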

Pages: 152-164

Number of pages: 13

50 items in total

[21] Deep noise mitigation and semantic reconstruction hashing for unsupervised cross-modal retrieval
Zhang, Cheng
Wan, Yuan
Qiang, Haopeng
[J]. NEURAL COMPUTING & APPLICATIONS, 2024, 36 (10): 5383-5397
[22] Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval
Cheng, Shuli
Wang, Liejun
Du, Anyu
[J]. ENTROPY, 2020, 22 (11): 1-22
[23] UNSUPERVISED CROSS-MODAL RETRIEVAL THROUGH ADVERSARIAL LEARNING
He, Li
Xu, Xing
Lu, Huimin
Yang, Yang
Shen, Fumin
Shen, Heng Tao
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017: 1153-1158
[24] UNSUPERVISED CONTRASTIVE HASHING FOR CROSS-MODAL RETRIEVAL IN REMOTE SENSING
Mikriukov, Georgii
Ravanbakhsh, Mahdyar
Demir, Begum
[J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 4463-4467
[25] Adversarial Projection Learning Based Hashing for Cross-Modal Retrieval
Zeng, Chao
Bai, Cong
Ma, Qing
Chen, Shengyong
[J]. Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2021, 33 (06): 904-912
[26] Semantic-rebased cross-modal hashing for scalable unsupervised text-visual retrieval
Wang, Weiwei
Shen, Yuming
Zhang, Haofeng
Liu, Li
[J]. INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (06)
[27] CLIP4Hashing: Unsupervised Deep Hashing for Cross-Modal Video-Text Retrieval
Zhuo, Yaoxin
Li, Yikang
Hsiao, Jenhao
Ho, Chiuman
Li, Baoxin
[J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022: 158-166
[28] Coupled CycleGAN: Unsupervised Hashing Network for Cross-Modal Retrieval
Li, Chao
Deng, Cheng
Wang, Lei
Xie, De
Liu, Xianglong
[J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019: 176-183
[29] Discrete semantic embedding hashing for scalable cross-modal retrieval
Liu, Junjie
Fei, Lunke
Jia, Wei
Zhao, Shuping
Wen, Jie
Teng, Shaohua
Zhang, Wei
[J]. 2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021: 1461-1467
[30] Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval
Zhan, Yu-Wei
Luo, Xin
Wang, Yongxin
Xu, Xin-Shun
[J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 3386-3394
