Deep Multigraph Hierarchical Enhanced Semantic Representation for Cross-Modal Retrieval

Cited by: 4
Authors
Zhu, Lei [1 ]
Zhang, Chengyuan [2 ]
Song, Jiayu [3 ]
Zhang, Shichao [4 ]
Tian, Chunwei [5 ]
Zhu, Xinghui [6 ]
Affiliations
[1] Hunan Agr Univ, Coll Informat & Intelligence, Changsha 410127, Hunan, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410012, Hunan, Peoples R China
[3] Cent South Univ, Comp Sci & Technol, Changsha 410017, Hunan, Peoples R China
[4] Cent South Univ, Sch Comp Sci & Technol, Changsha 410017, Hunan, Peoples R China
[5] Northwestern Polytech Univ, Sch Software, Xian 710060, Peoples R China
[6] Hunan Agr Univ, Changsha, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Adversarial machine learning; Correlation; Visualization; Generators; Generative adversarial networks; Computer science;
DOI
10.1109/MMUL.2022.3144138
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
The main challenge of cross-modal retrieval is how to efficiently realize cross-modal semantic alignment and reduce the heterogeneity gap. However, existing approaches either ignore multigrained semantic knowledge learning across different modalities or fail to learn consistent relation distributions of semantic details in multimodal instances. To this end, this article proposes a novel end-to-end cross-modal representation method, termed deep multigraph-based hierarchical enhanced semantic representation (MG-HESR). The method integrates multigraph-based hierarchical semantic representation with cross-modal adversarial learning: it captures multigrained semantic knowledge from cross-modal samples, realizes fine-grained alignment of semantic relation distributions, and then generates modality-invariant representations in a common subspace. To evaluate the performance, extensive experiments are conducted on four benchmarks. The experimental results show that our method is superior to state-of-the-art methods.
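Based only on the high-level description in the abstract, the following is a minimal PyTorch sketch of the general idea behind cross-modal adversarial common-subspace learning: two modality-specific encoders project image and text features into a shared space, and a modality discriminator is trained adversarially so the shared representations become modality-invariant. All names, dimensions (IMG_DIM, TXT_DIM, EMB_DIM), losses, and hyperparameters are illustrative assumptions, not the authors' MG-HESR implementation (which additionally builds multigraph hierarchical semantic representations).

# Hedged sketch of adversarial common-subspace learning for cross-modal retrieval.
# Layer sizes, loss weights, and the contrastive alignment term are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

IMG_DIM, TXT_DIM, EMB_DIM = 4096, 300, 256  # assumed feature dimensions

class Encoder(nn.Module):
    """Projects modality-specific features into the common subspace."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.ReLU(),
            nn.Linear(1024, EMB_DIM),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

class ModalityDiscriminator(nn.Module):
    """Predicts whether a shared-space vector came from the image or text branch."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(EMB_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, z):
        return self.net(z).squeeze(-1)  # logits

img_enc, txt_enc, disc = Encoder(IMG_DIM), Encoder(TXT_DIM), ModalityDiscriminator()
enc_opt = torch.optim.Adam(list(img_enc.parameters()) + list(txt_enc.parameters()), lr=1e-4)
disc_opt = torch.optim.Adam(disc.parameters(), lr=1e-4)

def training_step(img_feat, txt_feat):
    """One adversarial step on a batch of paired image/text features."""
    z_img, z_txt = img_enc(img_feat), txt_enc(txt_feat)

    # 1) Discriminator: distinguish image embeddings (label 1) from text ones (label 0).
    d_logits = torch.cat([disc(z_img.detach()), disc(z_txt.detach())])
    d_labels = torch.cat([torch.ones(len(z_img)), torch.zeros(len(z_txt))])
    d_loss = F.binary_cross_entropy_with_logits(d_logits, d_labels)
    disc_opt.zero_grad()
    d_loss.backward()
    disc_opt.step()

    # 2) Encoders: align paired samples (a simple contrastive surrogate) and fool
    #    the discriminator so the common subspace becomes modality-invariant.
    sim = z_img @ z_txt.t() / 0.07
    targets = torch.arange(len(z_img))
    align_loss = F.cross_entropy(sim, targets) + F.cross_entropy(sim.t(), targets)
    adv_logits = torch.cat([disc(z_img), disc(z_txt)])
    adv_loss = F.binary_cross_entropy_with_logits(adv_logits, 1.0 - d_labels)
    g_loss = align_loss + 0.1 * adv_loss
    enc_opt.zero_grad()
    g_loss.backward()
    enc_opt.step()
    return d_loss.item(), g_loss.item()

if __name__ == "__main__":
    losses = training_step(torch.randn(8, IMG_DIM), torch.randn(8, TXT_DIM))
    print("discriminator loss %.3f, encoder loss %.3f" % losses)

In this sketch the discriminator and the encoders are updated alternately; the contrastive term pulls paired image/text embeddings together, while the adversarial term removes modality-specific cues from the shared space.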
Pages: 17 - 26
Number of pages: 10
Related Papers
50 records in total
  • [41] Deep noise mitigation and semantic reconstruction hashing for unsupervised cross-modal retrieval
    Zhang, Cheng
    Wan, Yuan
    Qiang, Haopeng
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (10): 5383 - 5397
  • [42] Deep Semantic Correlation Learning based Hashing for Multimedia Cross-Modal Retrieval
    Gong, Xiaolong
    Huang, Linpeng
    Wang, Fuwei
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 117 - 126
  • [43] Multi-attention based semantic deep hashing for cross-modal retrieval
    Zhu, Liping
    Tian, Gangyi
    Wang, Bingyao
    Wang, Wenjie
    Zhang, Di
    Li, Chengyang
    APPLIED INTELLIGENCE, 2021, 51: 5927 - 5939
  • [44] Label-Semantic-Enhanced Online Hashing for Efficient Cross-modal Retrieval
    Jiang, Xueting
    Liu, Xin
    Cheung, Yiu-ming
    Xu, Xing
    Zheng, Shukai
    Li, Taihao
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 984 - 989
  • [45] Learning Disentangled Representation for Cross-Modal Retrieval with Deep Mutual Information Estimation
    Guo, Weikuo
    Huang, Huaibo
    Kong, Xiangwei
    He, Ran
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 1712 - 1720
  • [46] Hybrid DAER Based Cross-Modal Retrieval Exploiting Deep Representation Learning
    Huang, Zhao
    Hu, Haowu
    Su, Miao
    ENTROPY, 2023, 25 (08)
  • [47] Improvement of deep cross-modal retrieval by generating real-valued representation
    Bhatt, Nikita
    Ganatra, Amit
    PEERJ COMPUTER SCIENCE, 2021, 7 : 1 - 18
  • [48] Hierarchical Set-to-Set Representation for 3-D Cross-Modal Retrieval
    Jiang, Yu
    Hua, Cong
    Feng, Yifan
    Gao, Yue
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01) : 1302 - 1314
  • [49] Cross-Modal Retrieval Using Deep Learning
    Malik, Shaily
    Bhardwaj, Nikhil
    Bhardwaj, Rahul
    Kumar, Saurabh
    PROCEEDINGS OF THIRD DOCTORAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE, DOSCI 2022, 2023, 479 : 725 - 734
  • [50] Generalized Semantic Preserving Hashing for Cross-Modal Retrieval
    Mandal, Devraj
    Chaudhury, Kunal N.
    Biswas, Soma
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (01) : 102 - 112