Cross-modal discriminant adversarial network

Cited by: 16
Authors
Hu, Peng [1 ,2 ]
Peng, Xi [1 ]
Zhu, Hongyuan [2 ]
Lin, Jie [2 ]
Zhen, Liangli [3 ]
Wang, Wei [1 ]
Peng, Dezhong [1 ,4 ,5 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Agcy Sci Technol & Res, Inst Infocomm Res, Singapore, Singapore
[3] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore, Singapore
[4] Shenzhen Peng Cheng Lab, Shenzhen 518052, Peoples R China
[5] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adversarial learning; Cross-modal representation learning; Cross-modal retrieval; Discriminant adversarial network; Cross-modal discriminant mechanism; Latent common space;
DOI
10.1016/j.patcog.2020.107734
CLC classification
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-modal retrieval aims at retrieving relevant points across different modalities, such as retrieving images via texts. A key challenge of cross-modal retrieval is narrowing the heterogeneity gap across diverse modalities. To overcome this challenge, we propose a novel method termed Cross-modal discriminant Adversarial Network (CAN). Taking bi-modal data as a showcase, CAN consists of two parallel modality-specific generators, two modality-specific discriminators, and a Cross-modal Discriminant Mechanism (CDM). Specifically, the generators project diverse modalities into a latent cross-modal discriminant space. Meanwhile, the discriminators compete against the generators to alleviate the heterogeneous discrepancy in this space, i.e., the generators try to generate unified features to confuse the discriminators, while the discriminators aim to classify the generated results. To further remove redundancy and preserve discrimination, we propose CDM to project the generated results into a single common space, accompanied by a novel eigenvalue-based loss. Thanks to the eigenvalue-based loss, CDM can push as much discriminative power as possible into all latent directions. To demonstrate the effectiveness of CAN, comprehensive experiments are conducted on four multimedia datasets in comparison with 15 state-of-the-art approaches. (C) 2020 Elsevier Ltd. All rights reserved.
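The abstract's eigenvalue-based discriminant idea can be illustrated with a small sketch. This is not the paper's actual loss; it is a rough NumPy analogue in which discriminative structure in a common space is measured by the generalized eigenvalues of the between-class scatter Sb against the within-class scatter Sw (as in linear discriminant analysis), and an illustrative loss rewards large eigenvalues in all latent directions rather than only the leading one. The function names and the toy data are assumptions for illustration.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter of features X with labels y."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mu)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    return Sw, Sb

def discriminant_eigenvalues(X, y, eps=1e-6):
    """Eigenvalues of Sw^{-1} Sb, sorted descending. Large values in *all*
    directions mean discriminative power is spread across the latent space."""
    Sw, Sb = scatter_matrices(X, y)
    d = X.shape[1]
    lam = np.linalg.eigvals(np.linalg.solve(Sw + eps * np.eye(d), Sb))
    return np.sort(lam.real)[::-1]

def eigenvalue_loss(X, y):
    """Illustrative loss: becomes smaller (more negative) as every eigenvalue grows."""
    lam = discriminant_eigenvalues(X, y)
    return -np.mean(np.log1p(np.clip(lam, 0.0, None)))

# Toy "common space" features for 3 classes, separated along the first axis.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(60, 4))
X[:, 0] += 3.0 * y

print("eigenvalues:", np.round(discriminant_eigenvalues(X, y), 3))
print("loss:", round(eigenvalue_loss(X, y), 3))
```

In a real adversarial setup such a criterion would be evaluated on the generators' outputs; here the point is only that shuffling the labels destroys class separation, shrinks the eigenvalues, and raises the loss.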
Pages: 14
Related papers
50 records in total
  • [21] Li, Zhuoyi; Lu, Huibin; Fu, Hao; Meng, Fanzhen; Gu, Guanghua. CSAN: cross-coupled semantic adversarial network for cross-modal retrieval. ARTIFICIAL INTELLIGENCE REVIEW, 2025, 58 (05)
  • [22] Liao, Lei; Yang, Meng; Zhang, Bob. Deep Supervised Dual Cycle Adversarial Network for Cross-Modal Retrieval. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (02): 920-934
  • [23] Kim, Mooseop; Park, YunKyung; Moon, KyeongDeok; Jeong, Chi Yoon. Analysis and Validation of Cross-Modal Generative Adversarial Network for Sensory Substitution. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2021, 18 (12)
  • [24] Kang, Mingyu; Zhu, Ran; Chen, Duxin; Li, Chaojie; Gu, Wei; Qian, Xusheng; Yu, Wenwu. A Cross-Modal Generative Adversarial Network for Scenarios Generation of Renewable Energy. IEEE TRANSACTIONS ON POWER SYSTEMS, 2024, 39 (02): 2630-2640
  • [25] Zhang, Jian; Peng, Yuxin; Yuan, Mingkuan. Unsupervised Generative Adversarial Cross-Modal Hashing. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018: 539-546
  • [26] Zhang, Tao; Sun, Shiliang; Zhao, Jing. Robust Cross-Modal Retrieval by Adversarial Training. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [27] Wu, Yiling; Wang, Shuhui; Song, Guoli; Huang, Qingming. Augmented Adversarial Training for Cross-Modal Retrieval. IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23: 559-571
  • [28] Wu, Fei; Jing, Xiao-Yuan; Wu, Zhiyong; Ji, Yimu; Dong, Xiwei; Luo, Xiaokai; Huang, Qinghua; Wang, Ruchuan. Modality-specific and shared generative adversarial network for cross-modal retrieval. PATTERN RECOGNITION, 2020, 104
  • [29] Cao, Yuan; Gao, Yaru; Chen, Na; Lin, Jiacheng; Chen, Sheng. Generative Adversarial Network Based Asymmetric Deep Cross-Modal Unsupervised Hashing. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2023, PT I, 2024, 14487: 30-48
  • [30] Liu, Xin; Cheung, Yiu-ming; Hu, Zhikai; He, Yi; Zhong, Bineng. Adversarial Tri-Fusion Hashing Network for Imbalanced Cross-Modal Retrieval. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2021, 5 (04): 607-619