Deep Multigraph Hierarchical Enhanced Semantic Representation for Cross-Modal Retrieval

Cited: 4
Authors
Zhu, Lei [1 ]
Zhang, Chengyuan [2 ]
Song, Jiayu [3 ]
Zhang, Shichao [4 ]
Tian, Chunwei [5 ]
Zhu, Xinghui [6 ]
Affiliations
[1] Hunan Agr Univ, Coll Informat & Intelligence, Changsha 410127, Hunan, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410012, Hunan, Peoples R China
[3] Cent South Univ, Comp Sci & Technol, Changsha 410017, Hunan, Peoples R China
[4] Cent South Univ, Sch Comp Sci & Technol, Changsha 410017, Hunan, Peoples R China
[5] Northwestern Polytech Univ, Sch Software, Xian 710060, Peoples R China
[6] Hunan Agr Univ, Changsha, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Adversarial machine learning; Correlation; Visualization; Generators; Generative adversarial networks; Computer science;
DOI
10.1109/MMUL.2022.3144138
CLC Classification
TP3 [Computing technology; computer technology];
Discipline Code
0812 ;
Abstract
The main challenge of cross-modal retrieval is how to efficiently realize cross-modal semantic alignment and reduce the heterogeneity gap. However, existing approaches either ignore the multigrained semantic knowledge that can be learned from different modalities, or fail to learn consistent relation distributions of semantic details in multimodal instances. To this end, this article proposes a novel end-to-end cross-modal representation method, termed deep multigraph-based hierarchical enhanced semantic representation (MG-HESR). This method integrates MG-HESR with cross-modal adversarial learning, which captures multigrained semantic knowledge from cross-modal samples, realizes fine-grained semantic relation distribution alignment, and then generates modality-invariant representations in a common subspace. To evaluate its performance, extensive experiments are conducted on four benchmarks. The experimental results show that our method is superior to the state-of-the-art methods.
Pages: 17 - 26
Page count: 10
Related Papers
50 records total
  • [31] Analyzing semantic correlation for cross-modal retrieval
    Xie, Liang
    Pan, Peng
    Lu, Yansheng
    MULTIMEDIA SYSTEMS, 2015, 21 (06) : 525 - 539
  • [33] Hierarchical Consensus Hashing for Cross-Modal Retrieval
    Sun, Yuan
    Ren, Zhenwen
    Hu, Peng
    Peng, Dezhong
    Wang, Xu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 824 - 836
  • [34] Hybrid representation learning for cross-modal retrieval
    Cao, Wenming
    Lin, Qiubin
    He, Zhihai
    He, Zhiquan
    NEUROCOMPUTING, 2019, 345 : 45 - 57
  • [35] Multi-modal semantic autoencoder for cross-modal retrieval
    Wu, Yiling
    Wang, Shuhui
    Huang, Qingming
    NEUROCOMPUTING, 2019, 331 : 165 - 175
  • [36] Multilevel Deep Semantic Feature Asymmetric Network for Cross-Modal Hashing Retrieval
    Jiang, Xiaolong
    Fan, Jiabao
    Zhang, Jie
    Lin, Ziyong
    Li, Mingyong
    IEEE LATIN AMERICA TRANSACTIONS, 2024, 22 (08) : 621 - 631
  • [37] Multi-attention based semantic deep hashing for cross-modal retrieval
    Zhu, Liping
    Tian, Gangyi
    Wang, Bingyao
    Wang, Wenjie
    Zhang, Di
    Li, Chengyang
    APPLIED INTELLIGENCE, 2021, 51 (08) : 5927 - 5939
  • [38] Cross-Modal Event Retrieval: A Dataset and a Baseline Using Deep Semantic Learning
    Situ, Runwei
    Yang, Zhenguo
    Lv, Jianming
    Li, Qing
    Liu, Wenyin
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2018, PT II, 2018, 11165 : 147 - 157
  • [39] Deep noise mitigation and semantic reconstruction hashing for unsupervised cross-modal retrieval
    Zhang, Cheng
    Wan, Yuan
    Qiang, Haopeng
    NEURAL COMPUTING AND APPLICATIONS, 2024, 36 : 5383 - 5397
  • [40] Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval
    Cheng, Shuli
    Wang, Liejun
    Du, Anyu
    ENTROPY, 2020, 22 (11) : 1 - 22