Unsupervised Deep Relative Neighbor Relationship Preserving Cross-Modal Hashing

Cited by: 1
|
Authors
Yang, Xiaohan [1 ]
Wang, Zhen [1 ,2 ]
Wu, Nannan [1 ]
Li, Guokun [1 ]
Feng, Chuang [1 ]
Liu, Pingping [2 ]
Affiliations
[1] Shandong Univ Technol, Sch Comp Sci & Technol, Zibo 255000, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
cross-modal retrieval; image-text retrieval; cross-modal similarity preserving; hashing algorithm; unsupervised learning; NETWORK; VGG-16;
DOI
10.3390/math10152644
Chinese Library Classification (CLC) Number
O1 [Mathematics];
Subject Classification Codes
0701 ; 070101 ;
Abstract
The image-text cross-modal retrieval task, which aims to retrieve the relevant image given a text query and vice versa, is attracting widespread attention. To respond quickly at large scale, we propose Unsupervised Deep Relative Neighbor Relationship Preserving Cross-Modal Hashing (DRNPH), which performs cross-modal retrieval in a common Hamming space and thus enjoys advantages in storage and efficiency. To support nearest-neighbor search in the Hamming space, we reconstruct both the original intra- and inter-modal neighbor matrices from the binary feature vectors, so that the neighbor relationships among samples of different modalities can be computed directly from Hamming distances. Furthermore, the cross-modal pair-wise similarity preserving constraint requires that similar sample pairs have identical Hamming distances to the anchor; similar pairs therefore share the same binary code and have minimal Hamming distances. Unfortunately, the pair-wise similarity preserving constraint may lead to an imbalanced-code problem. We therefore propose a cross-modal triplet relative similarity preserving constraint, which demands that the Hamming distances of similar pairs be smaller than those of dissimilar pairs, so as to distinguish the samples' ranking order in the retrieval results. Moreover, a large similarity margin boosts the algorithm's robustness to noise. We conduct cross-modal retrieval comparison experiments and an ablation study on two public datasets, MIRFlickr and NUS-WIDE. The experimental results show that DRNPH outperforms state-of-the-art approaches in various image-text retrieval scenarios, and that all three proposed constraints are necessary and effective for boosting cross-modal retrieval performance.
Pages: 17
Related Papers
Total: 50 entries
  • [31] Unsupervised Cross-Modal Hashing With Modality-Interaction
    Tu, Rong-Cheng
    Jiang, Jie
    Lin, Qinghong
    Cai, Chengfei
    Tian, Shangxuan
    Wang, Hongfa
    Liu, Wei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5296 - 5308
  • [32] Robust Unsupervised Cross-modal Hashing for Multimedia Retrieval
    Cheng, Miaomiao
    Jing, Liping
    Ng, Michael K.
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2020, 38 (03)
  • [33] CLIP4Hashing: Unsupervised Deep Hashing for Cross-Modal Video-Text Retrieval
    Zhuo, Yaoxin
    Li, Yikang
    Hsiao, Jenhao
    Ho, Chiuman
    Li, Baoxin
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022, : 158 - 166
  • [34] Generalized Semantic Preserving Hashing for Cross-Modal Retrieval
    Mandal, Devraj
    Chaudhury, Kunal N.
    Biswas, Soma
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (01) : 102 - 112
  • [35] Cross-modal hashing based on category structure preserving
    Dong, Fei
    Nie, Xiushan
    Liu, Xingbo
    Geng, Leilei
    Wang, Qian
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2018, 57 : 28 - 33
  • [36] Deep Binary Reconstruction for Cross-Modal Hashing
    Hu, Di
    Nie, Feiping
    Li, Xuelong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (04) : 973 - 985
  • [37] Deep medical cross-modal attention hashing
    Zhang, Yong
    Ou, Weihua
    Shi, Yufeng
    Deng, Jiaxin
    You, Xinge
    Wang, Anzhi
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2022, 25 (04): : 1519 - 1536
  • [39] Deep Binary Reconstruction for Cross-modal Hashing
    Li, Xuelong
    Hu, Di
    Nie, Feiping
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 1398 - 1406
  • [40] Cross-modal hashing with semantic deep embedding
    Yan, Cheng
    Bai, Xiao
    Wang, Shuai
    Zhou, Jun
    Hancock, Edwin R.
    NEUROCOMPUTING, 2019, 337 : 58 - 66