Deep Multigraph Hierarchical Enhanced Semantic Representation for Cross-Modal Retrieval

Cited by: 4
Authors
Zhu, Lei [1 ]
Zhang, Chengyuan [2 ]
Song, Jiayu [3 ]
Zhang, Shichao [4 ]
Tian, Chunwei [5 ]
Zhu, Xinghui [6 ]
Affiliations
[1] Hunan Agr Univ, Coll Informat & Intelligence, Changsha 410127, Hunan, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410012, Hunan, Peoples R China
[3] Cent South Univ, Comp Sci & Technol, Changsha 410017, Hunan, Peoples R China
[4] Cent South Univ, Sch Comp Sci & Technol, Changsha 410017, Hunan, Peoples R China
[5] Northwestern Polytech Univ, Sch Software, Xian 710060, Peoples R China
[6] Hunan Agr Univ, Changsha, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Adversarial machine learning; Correlation; Visualization; Generators; Generative adversarial networks; Computer science;
DOI
10.1109/MMUL.2022.3144138
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
The main challenge of cross-modal retrieval is how to efficiently realize cross-modal semantic alignment and reduce the heterogeneity gap. However, existing approaches either ignore the multigrained semantic knowledge that can be learned from different modalities, or fail to learn consistent relation distributions of semantic details across multimodal instances. To this end, this article proposes a novel end-to-end cross-modal representation method, termed deep multigraph-based hierarchical enhanced semantic representation (MG-HESR). The method integrates multigraph-based hierarchical semantic learning with cross-modal adversarial learning: it captures multigrained semantic knowledge from cross-modal samples, aligns fine-grained semantic relation distributions, and then generates modality-invariant representations in a common subspace. To evaluate its performance, extensive experiments are conducted on four benchmarks. The experimental results show that our method is superior to the state-of-the-art methods.
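The core mechanism the abstract relies on, adversarial learning of a modality-invariant common subspace, can be sketched in miniature. The following is an illustrative NumPy toy of the generic technique only, not the authors' MG-HESR implementation: the linear projectors, the logistic modality discriminator, the feature dimensions, and the hand-derived update rule are all simplifying assumptions standing in for deep encoders trained in a GAN-style game.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IMG, D_TXT, D_COMMON = 8, 6, 4  # hypothetical feature sizes

# Linear projectors into the shared subspace (stand-ins for deep encoders).
W_img = rng.normal(scale=0.1, size=(D_IMG, D_COMMON))
W_txt = rng.normal(scale=0.1, size=(D_TXT, D_COMMON))
# Logistic discriminator that tries to tell image embeddings from text ones.
w_disc = rng.normal(scale=0.1, size=D_COMMON)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adversarial_step(x_img, x_txt, lr=0.1):
    """One simplified minimax update. The discriminator ascends its
    objective (label images 1, text 0); the projectors descend the same
    objective, pushing common-space features toward modality invariance."""
    global W_img, W_txt, w_disc
    n = len(x_img)
    z_i, z_t = x_img @ W_img, x_txt @ W_txt          # common-space embeddings
    p_i, p_t = sigmoid(z_i @ w_disc), sigmoid(z_t @ w_disc)

    # Discriminator gradient of mean[log p_i] + mean[log(1 - p_t)].
    grad_disc = ((1 - p_i)[:, None] * z_i - p_t[:, None] * z_t).mean(axis=0)
    w_disc += lr * grad_disc                          # ascend: separate modalities

    # Projector gradients of the same objective; descending it makes the
    # discriminator's job harder (gradient-reversal-style adversarial play).
    grad_W_img = x_img.T @ ((1 - p_i)[:, None] * w_disc[None, :]) / n
    grad_W_txt = x_txt.T @ ((-p_t)[:, None] * w_disc[None, :]) / n
    W_img -= lr * grad_W_img
    W_txt -= lr * grad_W_txt

# Toy "features" from two modalities; real inputs would be CNN/text encodings.
x_img = rng.normal(size=(16, D_IMG))
x_txt = rng.normal(size=(16, D_TXT))
for _ in range(50):
    adversarial_step(x_img, x_txt)
```

In a full system the projectors would also carry retrieval losses (e.g. the fine-grained semantic relation alignment the abstract describes), so that the common space is both modality-invariant and semantically discriminative.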
Pages: 17-26 (10 pages)