Clustering-driven Deep Adversarial Hashing for scalable unsupervised cross-modal retrieval

被引:9
|
作者
Shen, Xiao [1 ]
Zhang, Haofeng [1 ]
Li, Lunbo [1 ]
Zhang, Zheng [2 ]
Chen, Debao [3 ]
Liu, Li [4 ]
机构
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China
[3] Huaibei Normal Univ, Sch Comp Sci & Technol, Huaibei 235000, Peoples R China
[4] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
基金
中国国家自然科学基金;
关键词
Cross-modal retrieval; Hashing methods; Semantic similarity representation; Clustering algorithms;
D O I
10.1016/j.neucom.2021.06.087
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
With the advent of the big data era, multimedia data is growing rapidly, and its data modalities is also becoming diversified. Therefore, the demand for the speed and accuracy of cross-modal information retrieval is increasing. Hashing-based cross-modal retrieval technology attracts widespread attention, it encodes multimedia data into a common binary hash space, thereby effectively measuring the correlation between samples from different modalities. In this paper, we propose a novel end-to-end deep cross-modal retrieval framework, namely Clustering-driven Deep Adversarial Hashing (CDAH), which has three main characteristics. Firstly, CDAH learns discriminative clusters recursively through a soft clustering model. It attempts to generate modal-invariant representations in a common space by obfuscating the modality classifier, which tries to distinguish different modalities according to the generated representations. Secondly, in order to minimize the modal gap between feature representations from different modalities with the same semantic label, and to maximize the distance between images and texts with different labels, CDAH constructs a fused-semantics matrix to integrate the original domain information from different modalities, serving as self-supervised information to refine the binary codes. Finally, CDAH skillfully uses a scaled tanh function to adaptively learn the binary codes, which will gradually converge to the original tricky binary coding problem. We conduct comprehensive experiments on four popular datasets, and the experimental results demonstrate the superiority of our model against the state-of-the-art methods. (c) 2021 Elsevier B.V. All rights reserved.
引用
收藏
页码:152 / 164
页数:13
相关论文
共 50 条
  • [21] Deep noise mitigation and semantic reconstruction hashing for unsupervised cross-modal retrieval
    Zhang, Cheng
    Wan, Yuan
    Qiang, Haopeng
    [J]. NEURAL COMPUTING & APPLICATIONS, 2024, 36 (10): : 5383 - 5397
  • [22] Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval
    Cheng, Shuli
    Wang, Liejun
    Du, Anyu
    [J]. ENTROPY, 2020, 22 (11) : 1 - 22
  • [23] UNSUPERVISED CROSS-MODAL RETRIEVAL THROUGH ADVERSARIAL LEARNING
    He, Li
    Xu, Xing
    Lu, Huimin
    Yang, Yang
    Shen, Fumin
    Shen, Heng Tao
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 1153 - 1158
  • [24] UNSUPERVISED CONTRASTIVE HASHING FOR CROSS-MODAL RETRIEVAL IN REMOTE SENSING
    Mikriukov, Georgii
    Ravanbakhsh, Mahdyar
    Demir, Begum
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4463 - 4467
  • [25] Adversarial Projection Learning Based Hashing for Cross-Modal Retrieval
    Zeng, Chao
    Bai, Cong
    Ma, Qing
    Chen, Shengyong
    [J]. Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2021, 33 (06): : 904 - 912
  • [26] Semantic-rebased cross-modal hashing for scalable unsupervised text-visual retrieval
    Wang, Weiwei
    Shen, Yuming
    Zhang, Haofeng
    Liu, Li
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (06)
  • [27] CLIP4Hashing: Unsupervised Deep Hashing for Cross-Modal Video-Text Retrieval
    Zhuo, Yaoxin
    Li, Yikang
    Hsiao, Jenhao
    Ho, Chiuman
    Li, Baoxin
    [J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022, : 158 - 166
  • [28] Coupled CycleGAN: Unsupervised Hashing Network for Cross-Modal Retrieval
    Li, Chao
    Deng, Cheng
    Wang, Lei
    Xie, De
    Liu, Xianglong
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 176 - 183
  • [29] Discrete semantic embedding hashing for scalable cross-modal retrieval
    Liu, Junjie
    Fei, Lunke
    Jia, Wei
    Zhao, Shuping
    Wen, Jie
    Teng, Shaohua
    Zhang, Wei
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 1461 - 1467
  • [30] Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval
    Zhan, Yu-Wei
    Luo, Xin
    Wang, Yongxin
    Xu, Xin-Shun
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 3386 - 3394