Simulation of cross-modal image-text retrieval algorithm under convolutional neural network structure and hash method

Cited by: 2

Authors:

Yang, XianBen ^{[1]}

Zhang, Wei ^{[1]}

Affiliation:

[1] Beihua Univ, Coll Comp Sci & Technol, Jilin 132000, Jilin, Peoples R China

Source:

JOURNAL OF SUPERCOMPUTING | 2022 / Vol. 78 / Issue 05

Keywords:

Convolutional neural network; Hash algorithm; Cross-modal image-text retrieval; Semantic content constraints; Adversarial network;

DOI:

10.1007/s11227-021-04157-w

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This work aims to quickly find useful information in massive image databases, given the volume of images, videos, and other multimedia data generated on Internet platforms such as WeChat, Sina Weibo, and Twitter. A hash algorithm maps high-dimensional image features into binary code strings to address the problems of traditional image retrieval algorithms, such as high feature dimensionality, large storage requirements, and low retrieval efficiency. An end-to-end deep hash learning network is designed based on semantic preservation, with a deep Convolutional Neural Network (CNN) used for image feature extraction and hash function learning. A binary-constrained regularization term is added to the loss function, and a semantic preservation layer is added to optimize hash code generation. Furthermore, a semi-supervised hash learning network based on a Generative Adversarial Network (GAN) is proposed; it uses the hash network as the discriminator and introduces a discriminating node in the output layer to distinguish real samples from fake ones. For each innovation, retrieval experiments are carried out on several public datasets to verify the effectiveness of the proposed algorithm. The results show that the adaptive-weight loss function of the hash network reduces the influence of imbalanced positive and negative samples on retrieval performance, and that the constrained regularization terms reduce quantization error, avoid the information loss caused by the "relaxation" strategy in traditional methods, and optimize the hash structure. As a result, the obtained hash codes retain semantic similarity and improve retrieval accuracy. The GAN-based hash network improves the overall performance of the model by about 3%. Compared with existing hash-based image retrieval algorithms, the deep hash learning algorithm achieves better retrieval results. The study helps solve problems in image and text information retrieval and can provide ideas and methods for research on text retrieval.
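As a rough illustration of the ideas named in the abstract (binary hash codes, a penalty on quantization error, and retrieval by Hamming distance), the following minimal sketch may help. It is not the authors' network: the random projection standing in for the CNN, the tanh relaxation, the exact form of the penalty, and all function names are assumptions made for this example only.

```python
import numpy as np


def to_hash_codes(features, projection):
    """Map real-valued features to relaxed outputs in (-1, 1) and to codes in {-1, +1}.

    The linear projection is a stand-in (assumption) for a learned CNN hash layer.
    """
    relaxed = np.tanh(features @ projection)
    codes = np.where(relaxed >= 0, 1, -1)
    return relaxed, codes


def quantization_penalty(relaxed):
    """Mean squared gap between |relaxed| and 1; it is zero for exactly binary outputs.

    This is one common form of a binary-constraint regularizer (assumed here).
    """
    return float(np.mean((np.abs(relaxed) - 1.0) ** 2))


def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code.

    For codes in {-1, +1} with K bits, distance = (K - dot product) / 2.
    """
    n_bits = db_codes.shape[1]
    distances = (n_bits - db_codes @ query_code) // 2
    order = np.argsort(distances, kind="stable")
    return order, distances


# Toy data: 5 items with 16-dimensional features, hashed to 8 bits.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 16))
projection = rng.normal(size=(16, 8))

relaxed, codes = to_hash_codes(features, projection)
penalty = quantization_penalty(relaxed)
order, distances = hamming_rank(codes[0], codes)
print("quantization penalty:", penalty)
print("ranking for item 0:", order, "distances:", distances)
```

In a trained model, a penalty of this kind would be added to the similarity loss so that the relaxed outputs move toward ±1 before they are binarized, which is the role the abstract attributes to the binary-constrained regularization term.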

Citation

Pages: 7106 / 7132

Number of pages: 27

50 items in total

[1] RETRACTED ARTICLE: Simulation of cross-modal image-text retrieval algorithm under convolutional neural network structure and hash method
XianBen Yang
Wei Zhang
[J]. The Journal of Supercomputing, 2022, 78 : 7106 - 7132
[2] RETRACTION: Simulation of cross-modal image-text retrieval algorithm under convolutional neural network structure and hash method (Retraction of Vol 78, art no 7106, 2021)
Yang, XianBen
Zhang, Wei
[J]. JOURNAL OF SUPERCOMPUTING, 2024, 80 (09) : 13497 - 13497
[3] RICH: A rapid method for image-text cross-modal hash retrieval
Li, Bo
Yao, Dan
Li, Zhixin
[J]. DISPLAYS, 2023, 79
[4] Cross-modal fabric image-text retrieval based on convolutional neural network and TinyBERT
Xiang, Jun
Zhang, Ning
Pan, Ruru
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (21) : 59725 - 59746
[5] Cross-modal Graph Matching Network for Image-text Retrieval
Cheng, Yuhao
Zhu, Xiaoguang
Qian, Jiuchao
Wen, Fei
Liu, Peilin
[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (04)
[6] Heterogeneous Graph Fusion Network for cross-modal image-text retrieval
Qin, Xueyang
Li, Lishuang
Pang, Guangyao
Hao, Fei
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
[7] Image-text bidirectional learning network based cross-modal retrieval
Li, Zhuoyi
Lu, Huibin
Fu, Hao
Gu, Guanghua
[J]. NEUROCOMPUTING, 2022, 483 : 148 - 159
[8] Cross-modal Image-Text Retrieval with Multitask Learning
Luo, Junyu
Shen, Ying
Ao, Xiang
Zhao, Zhou
Yang, Min
[J]. PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019 : 2309 - 2312
[9] Rethinking Benchmarks for Cross-modal Image-text Retrieval
Chen, Weijing
Yao, Linli
Jin, Qin
[J]. PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023 : 1241 - 1251
[10] Cross-Modal Image-Text Retrieval with Semantic Consistency
Chen, Hui
Ding, Guiguang
Lin, Zijin
Zhao, Sicheng
Han, Jungong
[J]. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019 : 1749 - 1757
