Uniting Image and Text Deep Networks via Bi-directional Triplet Loss for Retrieval

Cited by: 0
Authors
Hua, Yan [1]
Du, Jianhe [1]
Affiliations
[1] Communication University of China, School of Information and Communication Engineering, Beijing, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Triplet loss; Bi-direction; Image-text retrieval
DOI
10.1109/iceiec.2019.8784629
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Images and text are heterogeneous data, so it is difficult to retrieve images with a text query or to retrieve texts with an image query. Thanks to the success of deep learning in recent years, feature representations of images and text have advanced greatly. However, distances between the two kinds of features still cannot be compared directly, because they come from different modalities. In this paper, we propose a bi-directional triplet constraint for learning image and text deep networks by simultaneously 1) minimizing the distance between relevant image-text pairs, and 2) pushing both the distance between an image and its irrelevant text and the distance between a text and its irrelevant image to be larger than the distance of the relevant pair. Our triplet loss can be seen as a cross-modal and bi-directional extension of the large margin nearest neighbor method, which was designed for single-modal data classification. For raw images, a fully connected subnetwork based on ResNet is designed for image representation learning, and the same architecture is designed for text representation learning. The two deep models are learned jointly with the bi-directional triplet loss in an end-to-end manner. Experiments on a widely used dataset verify the effectiveness of the proposed model.
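
The two terms of the constraint described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the squared Euclidean distance, the hinge form, the `margin` value, the `pair_weight` factor, the averaging, and the function name `bidirectional_triplet_loss` are all assumptions made for illustration. Row i of `img` and row i of `txt` are assumed to be the embeddings of a relevant image-text pair, and every other row in the batch is treated as irrelevant.

```python
import numpy as np

def bidirectional_triplet_loss(img, txt, margin=0.2, pair_weight=1.0):
    """Illustrative bi-directional triplet loss (assumed formulation).

    img, txt: (n, d) arrays; img[i] and txt[i] form a relevant pair.
    """
    # Squared Euclidean distance between every image and every text: (n, n).
    diff = img[:, None, :] - txt[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)
    pos = np.diag(dist)  # distances of the relevant pairs

    # Image -> text: an irrelevant text j should be farther from image i
    # than the relevant text i, by at least `margin`.
    i2t = np.maximum(0.0, margin + pos[:, None] - dist)
    # Text -> image: an irrelevant image i should be farther from text j
    # than the relevant image j, by at least `margin`.
    t2i = np.maximum(0.0, margin + pos[None, :] - dist)

    off_diag = ~np.eye(len(img), dtype=bool)  # mask out the relevant pairs
    pull = pos.mean()  # term 1: shrink relevant-pair distances
    push = (i2t[off_diag].sum() + t2i[off_diag].sum()) / max(off_diag.sum(), 1)
    return pair_weight * pull + push

# Perfectly aligned, well-separated embeddings incur no loss.
aligned = bidirectional_triplet_loss(np.eye(3), np.eye(3))
# Shifting the texts by one position mismatches every pair.
shifted = bidirectional_triplet_loss(np.eye(3), np.roll(np.eye(3), 1, axis=0))
```

In this sketch the pull term plays the role of objective 1) and the two hinge terms play the role of objective 2), one per retrieval direction. In an end-to-end setting the same expression would be written in an autodiff framework so that gradients flow into both subnetworks.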
Pages: 297-300
Number of pages: 4