Simulation of cross-modal image-text retrieval algorithm under convolutional neural network structure and hash method

Cited by: 2
Authors
Yang, XianBen [1]
Zhang, Wei [1]
Affiliations
[1] Beihua Univ, Coll Comp Sci & Technol, Jilin 132000, Jilin, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2022 / Vol. 78 / Issue 05
Keywords
Convolutional neural network; Hash algorithm; Cross-modal image-text retrieval; Semantic content constraints; Adversarial network;
DOI
10.1007/s11227-021-04157-w
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The purpose of this work is to quickly find useful information in massive image databases, given the images, videos, and other multimedia data generated on Internet platforms such as WeChat, Sina Weibo, and Twitter. A hash algorithm is used to map high-dimensional image features into binary code strings, addressing problems of traditional image retrieval algorithms such as high feature dimension, large storage space, and low retrieval efficiency. An end-to-end deep hash learning network is designed based on semantic preservation, and a deep Convolutional Neural Network (CNN) is adopted for image feature extraction and hash function learning. Moreover, a binary-constrained regularization term is added to the loss function, and a semantic preservation layer is added to optimize the generation of hash codes. Furthermore, a semi-supervised hash learning network is proposed based on the Generative Adversarial Network (GAN); it takes the hash network as the discriminator, and a discriminating node is introduced into the output layer to distinguish real samples from fake ones. For each innovation of the algorithm, retrieval experiments are carried out on several public datasets to verify the effectiveness of the proposed algorithm. The results reveal that the adaptive-weight loss function of the hash network reduces the influence of imbalanced positive and negative samples on retrieval performance, and that the constrained regularization term reduces the error caused by quantization, avoids the information loss caused by the "relaxation" strategy in traditional methods, and optimizes the hash structure. In this way, the obtained hash codes retain semantic similarity and improve retrieval accuracy. The GAN-based hash network improves the overall performance of the model by about 3%. Compared with existing hash methods, the deep hash learning algorithm performs better in image retrieval. The study effectively addresses the problems in image and text information retrieval and can provide some ideas and methods for research on text retrieval.
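The hashing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' network: a fixed random projection followed by tanh stands in for the learned CNN hash layer, and every name (`hash_codes`, `quantization_penalty`, `hamming_rank`) is hypothetical. It shows how real-valued features become binary codes, how a binary-constraint penalty measures the quantization gap left by a relaxed output, and how retrieval reduces to ranking by Hamming distance.

```python
import numpy as np

def hash_codes(features, projection):
    """Map features to relaxed outputs in (-1, 1) and to binary codes in {-1, +1}.

    Assumption: a linear projection plus tanh replaces the learned CNN hash layer.
    """
    relaxed = np.tanh(features @ projection)
    codes = np.where(relaxed >= 0, 1, -1)
    return relaxed, codes

def quantization_penalty(relaxed):
    """Mean squared gap between |output| and 1; zero when outputs are exactly binary."""
    return float(np.mean((np.abs(relaxed) - 1.0) ** 2))

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query.

    For codes in {-1, +1} with k bits, distance = (k - dot product) / 2.
    """
    n_bits = db_codes.shape[1]
    dists = (n_bits - db_codes @ query_code) // 2
    order = np.argsort(dists, kind="stable")
    return order, dists

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 16))    # 6 items, 16-dimensional features (toy data)
projection = rng.normal(size=(16, 8))  # 8-bit codes
relaxed, codes = hash_codes(features, projection)
order, dists = hamming_rank(codes[0], codes)
print(order, dists, quantization_penalty(relaxed))
```

Adding a penalty of this kind to a training loss pushes the relaxed outputs toward -1 or +1, so taking the sign afterwards discards less information. The exact form of the regularization term used in the paper is not given in the abstract.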
Pages: 7106 - 7132
Page count: 27
Related papers
50 items in total
  • [1] RETRACTED ARTICLE: Simulation of cross-modal image-text retrieval algorithm under convolutional neural network structure and hash method
    XianBen Yang
    Wei Zhang
    [J]. The Journal of Supercomputing, 2022, 78 : 7106 - 7132
  • [2] RETRACTION: Simulation of cross-modal image-text retrieval algorithm under convolutional neural network structure and hash method (Retraction of Vol 78, art no 7106, 2021)
    Yang, XianBen
    Zhang, Wei
    [J]. JOURNAL OF SUPERCOMPUTING, 2024, 80 (09) : 13497 - 13497
  • [3] RICH: A rapid method for image-text cross-modal hash retrieval
    Li, Bo
    Yao, Dan
    Li, Zhixin
    [J]. DISPLAYS, 2023, 79
  • [4] Cross-modal fabric image-text retrieval based on convolutional neural network and TinyBERT
    Xiang, Jun
    Zhang, Ning
    Pan, Ruru
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (21) : 59725 - 59746
  • [5] Cross-modal Graph Matching Network for Image-text Retrieval
    Cheng, Yuhao
    Zhu, Xiaoguang
    Qian, Jiuchao
    Wen, Fei
    Liu, Peilin
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (04)
  • [6] Heterogeneous Graph Fusion Network for cross-modal image-text retrieval
    Qin, Xueyang
    Li, Lishuang
    Pang, Guangyao
    Hao, Fei
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [7] Image-text bidirectional learning network based cross-modal retrieval
    Li, Zhuoyi
    Lu, Huibin
    Fu, Hao
    Gu, Guanghua
    [J]. NEUROCOMPUTING, 2022, 483 : 148 - 159
  • [8] Cross-modal Image-Text Retrieval with Multitask Learning
    Luo, Junyu
    Shen, Ying
    Ao, Xiang
    Zhao, Zhou
    Yang, Min
    [J]. PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019 : 2309 - 2312
  • [9] Rethinking Benchmarks for Cross-modal Image-text Retrieval
    Chen, Weijing
    Yao, Linli
    Jin, Qin
    [J]. PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023 : 1241 - 1251
  • [10] Cross-Modal Image-Text Retrieval with Semantic Consistency
    Chen, Hui
    Ding, Guiguang
    Lin, Zijin
    Zhao, Sicheng
    Han, Jungong
    [J]. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019 : 1749 - 1757