Deep supervised multimodal semantic autoencoder for cross-modal retrieval

Times Cited: 1
Authors
Tian, Yu [1 ,2 ]
Yang, Wenjing [1 ,2 ]
Liu, Qingsong [3 ]
Yang, Qiong [1 ,2 ]
Affiliations
[1] Natl Univ Def Technol, Inst Quantum Informat, Coll Comp, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, State Key Lab High Performance Comp, Coll Comp, Changsha 410073, Hunan, Peoples R China
[3] Naval Aeronaut Univ, Yantai, Peoples R China
Keywords
autoencoder; cross-modal retrieval; semantic-aware feature vectors
DOI
10.1002/cav.1962
CLC Classification
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Cross-modal retrieval aims to perform flexible retrieval across different modalities, and its central challenge is measuring semantic similarity among multimodal data. Although many methods have been proposed for cross-modal retrieval, they rarely consider preserving the content information of multimodal data. In this paper, we present a three-stage cross-modal retrieval method named MMCA-CMR. To reduce the discrepancy among multimodal data, we first embed the multimodal data into a common representation space. We then combine the feature vectors with content information to form semantic-aware feature vectors. Finally, we obtain feature-aware and content-aware projections via multimodal semantic autoencoders. By learning feature vectors from different modalities and content information simultaneously, the deep semantic autoencoders enable MMCA-CMR to perform more reliable cross-modal retrieval. Extensive experiments demonstrate the effectiveness of the proposed method, which significantly outperforms state-of-the-art approaches on four widely used benchmark datasets.
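Since this record contains only the abstract, the following is a minimal, hypothetical sketch of the general idea described above, not the authors' released code: modality-specific encoders map image and text features into a common space, label (content) information is concatenated to form semantic-aware feature vectors, and autoencoder-style reconstruction plus a cross-modal alignment term encourage the projections to preserve content. All module names, dimensionalities, and loss weights are illustrative assumptions.

```python
# Hypothetical sketch of a multimodal semantic autoencoder for cross-modal
# retrieval, loosely following the three-stage description in the abstract.
# Dimensions, architecture, and loss weighting are assumptions, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultimodalSemanticAE(nn.Module):
    def __init__(self, img_dim=4096, txt_dim=300, num_classes=10, latent_dim=128):
        super().__init__()
        # Stage 1: embed each modality into a common representation space.
        self.img_enc = nn.Sequential(nn.Linear(img_dim, 512), nn.ReLU(),
                                     nn.Linear(512, latent_dim))
        self.txt_enc = nn.Sequential(nn.Linear(txt_dim, 512), nn.ReLU(),
                                     nn.Linear(512, latent_dim))
        # Stage 3: decoders reconstruct the semantic-aware vectors
        # (common-space features concatenated with label/content information).
        self.img_dec = nn.Linear(latent_dim, latent_dim + num_classes)
        self.txt_dec = nn.Linear(latent_dim, latent_dim + num_classes)

    def forward(self, img, txt, labels_onehot):
        z_img = self.img_enc(img)                      # common space (image)
        z_txt = self.txt_enc(txt)                      # common space (text)
        # Stage 2: semantic-aware feature vectors = features + content info.
        s_img = torch.cat([z_img, labels_onehot], dim=1)
        s_txt = torch.cat([z_txt, labels_onehot], dim=1)
        rec_img = self.img_dec(z_img)
        rec_txt = self.txt_dec(z_txt)
        # Reconstruction preserves content information; the alignment term
        # pulls the two modalities together in the common space.
        loss = (F.mse_loss(rec_img, s_img) + F.mse_loss(rec_txt, s_txt)
                + F.mse_loss(z_img, z_txt))
        return z_img, z_txt, loss

if __name__ == "__main__":
    model = MultimodalSemanticAE()
    img = torch.randn(8, 4096)   # e.g. CNN image features
    txt = torch.randn(8, 300)    # e.g. word-vector text features
    y = F.one_hot(torch.randint(0, 10, (8,)), 10).float()
    _, _, loss = model(img, txt, y)
    loss.backward()
    print(float(loss))
```

In such a setup, the learned common-space features `z_img` and `z_txt` would be indexed after training and compared with cosine or Euclidean distance to answer image-to-text and text-to-image queries.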
Pages: 12
Related Papers
50 records in total
  • [1] Zhen, Liangli; Hu, Peng; Wang, Xu; Peng, Dezhong. Deep Supervised Cross-modal Retrieval. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019: 10386-10395.
  • [2] Wu, Yiling; Wang, Shuhui; Huang, Qingming. Multi-modal semantic autoencoder for cross-modal retrieval. Neurocomputing, 2019, 331: 165-175.
  • [3] Wang, Cheng; Yang, Haojin; Meinel, Christoph. Deep Semantic Mapping for Cross-Modal Retrieval. 2015 IEEE 27th International Conference on Tools with Artificial Intelligence (ICTAI 2015), 2015: 234-241.
  • [4] Zhang, Donglin; Wu, Xiao-Jun; Chen, Guoqing. ONION: Online Semantic Autoencoder Hashing for Cross-Modal Retrieval. ACM Transactions on Intelligent Systems and Technology, 2023, 14(2).
  • [5] Zhan, Yu-Wei; Luo, Xin; Wang, Yongxin; Xu, Xin-Shun. Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval. MM '20: Proceedings of the 28th ACM International Conference on Multimedia, 2020: 3386-3394.
  • [6] Zhen, Liangli; Hu, Peng; Peng, Xi; Goh, Rick Siow Mong; Zhou, Joey Tianyi. Deep Multimodal Transfer Learning for Cross-Modal Retrieval. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(2): 798-810.
  • [7] Li, Mingyong; Wang, Hongya. Deep Semantic Adversarial Hashing Based on Autoencoder for Large-Scale Cross-Modal Retrieval. 2020 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), 2020.
  • [8] Hu, Peng; Zhen, Liangli; Peng, Dezhong; Liu, Pei. Scalable Deep Multimodal Learning for Cross-Modal Retrieval. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '19), 2019: 635-644.
  • [9] Feng, Fangxiang; Wang, Xiaojie; Li, Ruifan. Cross-modal Retrieval with Correspondence Autoencoder. Proceedings of the 2014 ACM Conference on Multimedia (MM '14), 2014: 7-16.
  • [10] Li, Yifan; Wang, Xuan; Cui, Lei; Zhang, Jiajia; Huang, Chengkai; Luo, Xuan; Qi, Shuhan. Autoencoder-based self-supervised hashing for cross-modal retrieval. Multimedia Tools and Applications, 2021, 80(11): 17257-17274.