Hybrid DAER Based Cross-Modal Retrieval Exploiting Deep Representation Learning

Cited by: 0
Authors
Huang, Zhao [1 ,2 ]
Hu, Haowu [2 ]
Su, Miao [2 ]
Affiliations
[1] Minist Educ, Key Lab Modern Teaching Technol, Xian 710062, Peoples R China
[2] Shaanxi Normal Univ, Sch Comp Sci, Xian 710119, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
dual attention network; data augmentation; cross-modal retrieval; enhanced relation network; canonical correlation analysis; network
DOI
10.3390/e25081216
CLC Number
O4 [Physics];
Subject Classification Code
0702;
Abstract
Information retrieval across multiple modalities has attracted much attention from academics and practitioners. A key challenge of cross-modal retrieval is bridging the heterogeneity gap between different modalities. Most existing methods jointly construct a common subspace, but little attention has been paid to the varying importance of fine-grained regions within each modality. This omission limits how fully the information extracted from each modality is exploited. Therefore, this study proposes a novel text-image cross-modal retrieval approach that combines a dual attention network and an enhanced relation network (DAER). More specifically, the dual attention network extracts fine-grained weight information from text and images, while the enhanced relation network enlarges the differences between data categories to improve the accuracy of similarity computation. Comprehensive experimental results on three widely used datasets (i.e., Wikipedia, Pascal Sentence, and XMediaNet) show that the proposed approach is effective and superior to existing cross-modal retrieval methods.
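The abstract describes two components: a dual attention mechanism that weights fine-grained regions of each modality before pooling, and a relation network that scores cross-modal similarity on the pooled embeddings. The following minimal PyTorch sketch illustrates that general structure; the layer sizes, scoring functions, and module names are assumptions made for illustration, not the paper's actual DAER implementation.

```python
# Illustrative sketch only: a simple attention-pooling module per modality and
# a relation network over the pooled pair. DAER's real architecture may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualAttention(nn.Module):
    """Weights fine-grained regions (image patches / text tokens) before pooling."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # scalar relevance score per region/token

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, n_regions, dim) -> attention-weighted summary (batch, dim)
        weights = F.softmax(self.score(feats), dim=1)  # (batch, n_regions, 1)
        return (weights * feats).sum(dim=1)


class RelationNetwork(nn.Module):
    """Regresses a similarity score for an (image, text) embedding pair."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, img: torch.Tensor, txt: torch.Tensor) -> torch.Tensor:
        # Concatenate the two modality embeddings and predict a relation score.
        return torch.sigmoid(self.mlp(torch.cat([img, txt], dim=-1))).squeeze(-1)


if __name__ == "__main__":
    img_regions = torch.randn(4, 36, 512)  # e.g., 36 image region features
    txt_tokens = torch.randn(4, 20, 512)   # e.g., 20 word/token features
    attend = DualAttention(512)
    relate = RelationNetwork(512)
    score = relate(attend(img_regions), attend(txt_tokens))
    print(score.shape)  # torch.Size([4]) -- one similarity score per pair
```

The design intuition, as stated in the abstract, is that attention-weighted pooling lets each modality emphasize its most informative regions before the relation network compares the pooled embeddings, rather than relying on a fixed distance in a common subspace.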
Pages: 18
Related Papers
50 records in total
  • [21] Deep Multigraph Hierarchical Enhanced Semantic Representation for Cross-Modal Retrieval
    Zhu, Lei
    Zhang, Chengyuan
    Song, Jiayu
    Zhang, Shichao
    Tian, Chunwei
    Zhu, Xinghui
    IEEE MULTIMEDIA, 2022, 29 (03) : 17 - 26
  • [22] Deep Semantic Correlation Learning based Hashing for Multimedia Cross-Modal Retrieval
    Gong, Xiaolong
    Huang, Linpeng
    Wang, Fuwei
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 117 - 126
  • [23] Adaptively Efficient Deep Cross-Modal Hash Retrieval Based on Incremental Learning
    Zhou, Kun
    Xu, Liming
    Zheng, Bochuan
    Xie, Yicai
COMPUTER ENGINEERING AND APPLICATIONS, 2024, 59 (02) : 85 - 93
  • [24] Deep Supervised Cross-modal Retrieval
    Zhen, Liangli
    Hu, Peng
    Wang, Xu
    Peng, Dezhong
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10386 - 10395
  • [25] Cross-Modal Retrieval using Random Multimodal Deep Learning
    Somasekar, Hemanth
    Naveen, Kavya
JOURNAL OF MECHANICS OF CONTINUA AND MATHEMATICAL SCIENCES, 2019, 14 (02) : 185 - 200
  • [26] Deep Evidential Learning with Noisy Correspondence for Cross-modal Retrieval
    Qin, Yang
    Peng, Dezhong
    Peng, Xi
    Wang, Xu
    Hu, Peng
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4948 - 4956
  • [27] Deep Semantic Correlation with Adversarial Learning for Cross-Modal Retrieval
    Hua, Yan
    Du, Jianhe
    PROCEEDINGS OF 2019 IEEE 9TH INTERNATIONAL CONFERENCE ON ELECTRONICS INFORMATION AND EMERGENCY COMMUNICATION (ICEIEC 2019), 2019, : 252 - 255
  • [28] DRSL: Deep Relational Similarity Learning for Cross-modal Retrieval
    Wang, Xu
    Hu, Peng
    Zhen, Liangli
    Peng, Dezhong
    INFORMATION SCIENCES, 2021, 546 : 298 - 311
  • [29] Natural Language-Based Vehicle Retrieval with Explicit Cross-Modal Representation Learning
    Xu, Bocheng
    Xiong, Yihua
    Zhang, Rui
    Feng, Yanyi
    Wu, Haifeng
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 3141 - 3148
  • [30] Adversarial cross-modal retrieval based on dictionary learning
    Shang, Fei
    Zhang, Huaxiang
    Zhu, Lei
    Sun, Jiande
    NEUROCOMPUTING, 2019, 355 : 93 - 104