Unsupervised Deep Relative Neighbor Relationship Preserving Cross-Modal Hashing

Cited by: 1
|
Authors
Yang, Xiaohan [1 ]
Wang, Zhen [1 ,2 ]
Wu, Nannan [1 ]
Li, Guokun [1 ]
Feng, Chuang [1 ]
Liu, Pingping [2 ]
Affiliations
[1] Shandong Univ Technol, Sch Comp Sci & Technol, Zibo 255000, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
cross-modal retrieval; image-text retrieval; cross-modal similarity preserving; hashing algorithm; unsupervised learning; NETWORK; VGG-16;
DOI
10.3390/math10152644
Chinese Library Classification (CLC) Number
O1 [Mathematics];
Subject Classification Codes
0701 ; 070101 ;
Abstract
The image-text cross-modal retrieval task, which aims to retrieve the relevant image given a text query and vice versa, is attracting widespread attention. To respond quickly at large scale, we propose Unsupervised Deep Relative Neighbor Relationship Preserving Cross-Modal Hashing (DRNPH), which performs cross-modal retrieval in a common Hamming space and thus enjoys advantages in storage and efficiency. To support nearest-neighbor search in the Hamming space, we reconstruct both the original intra- and inter-modal neighbor matrices from the binary feature vectors, so that the neighbor relationships among samples of different modalities can be computed directly from Hamming distances. Furthermore, the cross-modal pair-wise similarity preserving constraint requires that similar sample pairs have identical Hamming distances to the anchor; similar pairs therefore share the same binary code and have minimal Hamming distances. Unfortunately, the pair-wise similarity preserving constraint may lead to an imbalanced-code problem. We therefore propose a cross-modal triplet relative similarity preserving constraint, which demands that the Hamming distances of similar pairs be smaller than those of dissimilar pairs, so as to distinguish the samples' ranking order in the retrieval results. Moreover, a large similarity margin boosts the algorithm's robustness to noise. We conduct cross-modal retrieval comparison experiments and an ablation study on two public datasets, MIRFlickr and NUS-WIDE. The experimental results show that DRNPH outperforms state-of-the-art approaches in various image-text retrieval scenarios, and that all three proposed constraints are necessary and effective for boosting cross-modal retrieval performance.
Pages: 17
Related Papers
Total: 50 entries
  • [31] Unsupervised Cross-Modal Hashing With Modality-Interaction
    Tu, Rong-Cheng
    Jiang, Jie
    Lin, Qinghong
    Cai, Chengfei
    Tian, Shangxuan
    Wang, Hongfa
    Liu, Wei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5296 - 5308
  • [32] Robust Unsupervised Cross-modal Hashing for Multimedia Retrieval
    Cheng, Miaomiao
    Jing, Liping
    Ng, Michael K.
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2020, 38 (03)
  • [33] CLIP4Hashing: Unsupervised Deep Hashing for Cross-Modal Video-Text Retrieval
    Zhuo, Yaoxin
    Li, Yikang
    Hsiao, Jenhao
    Ho, Chiuman
    Li, Baoxin
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022, : 158 - 166
  • [34] Generalized Semantic Preserving Hashing for Cross-Modal Retrieval
    Mandal, Devraj
    Chaudhury, Kunal N.
    Biswas, Soma
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (01) : 102 - 112
  • [35] Cross-modal hashing based on category structure preserving
    Dong, Fei
    Nie, Xiushan
    Liu, Xingbo
    Geng, Leilei
    Wang, Qian
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2018, 57 : 28 - 33
  • [36] Deep Binary Reconstruction for Cross-Modal Hashing
    Hu, Di
    Nie, Feiping
    Li, Xuelong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (04) : 973 - 985
  • [37] Deep medical cross-modal attention hashing
    Zhang, Yong
    Ou, Weihua
    Shi, Yufeng
    Deng, Jiaxin
    You, Xinge
    Wang, Anzhi
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2022, 25 (04): : 1519 - 1536
  • [39] Deep Binary Reconstruction for Cross-modal Hashing
    Li, Xuelong
    Hu, Di
    Nie, Feiping
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 1398 - 1406
  • [40] Cross-modal hashing with semantic deep embedding
    Yan, Cheng
    Bai, Xiao
    Wang, Shuai
    Zhou, Jun
    Hancock, Edwin R.
    NEUROCOMPUTING, 2019, 337 : 58 - 66