Cross-modal discriminant adversarial network

Cited by: 16
Authors
Hu, Peng [1 ,2 ]
Peng, Xi [1 ]
Zhu, Hongyuan [2 ]
Lin, Jie [2 ]
Zhen, Liangli [3 ]
Wang, Wei [1 ]
Peng, Dezhong [1 ,4 ,5 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Agcy Sci Technol & Res, Inst Infocomm Res, Singapore, Singapore
[3] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore, Singapore
[4] Shenzhen Peng Cheng Lab, Shenzhen 518052, Peoples R China
[5] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adversarial learning; Cross-modal representation learning; Cross-modal retrieval; Discriminant adversarial network; Cross-modal discriminant mechanism; Latent common space;
DOI
10.1016/j.patcog.2020.107734
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-modal retrieval aims at retrieving relevant points across different modalities, such as retrieving images via texts. One key challenge of cross-modal retrieval is narrowing the heterogeneous gap across diverse modalities. To overcome this challenge, we propose a novel method termed Cross-modal discriminant Adversarial Network (CAN). Taking bi-modal data as a showcase, CAN consists of two parallel modality-specific generators, two modality-specific discriminators, and a Cross-modal Discriminant Mechanism (CDM). To be specific, the generators project diverse modalities into a latent cross-modal discriminant space. Meanwhile, the discriminators compete against the generators to alleviate the heterogeneous discrepancy in this space, i.e., the generators try to generate unified features to confuse the discriminators, and the discriminators aim to classify the generated results. To further remove redundancy and preserve discrimination, we propose CDM to project the generated results into a single common space, accompanied by a novel eigenvalue-based loss. Thanks to the eigenvalue-based loss, CDM can push as much discriminative power as possible into all latent directions. To demonstrate the effectiveness of our CAN, comprehensive experiments are conducted on four multimedia datasets, comparing it with 15 state-of-the-art approaches. (C) 2020 Elsevier Ltd. All rights reserved.
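A minimal PyTorch sketch of the bi-modal setup the abstract describes is given below. The class names, layer sizes, and the concrete form of the eigenvalue-based loss (an LDA-style generalized-eigenvalue surrogate) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class ModalityGenerator(nn.Module):
    """Projects one modality's raw features (e.g., image or text
    descriptors) into the shared latent discriminant space."""

    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 1024),
            nn.ReLU(),
            nn.Linear(1024, latent_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class ModalityDiscriminator(nn.Module):
    """Scores whether a latent feature came from its own modality; the
    generators are trained to fool it, shrinking the heterogeneous gap."""

    def __init__(self, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),  # modality logit
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)


def eigenvalue_based_loss(z: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Hypothetical surrogate for the CDM objective: maximize the smallest
    generalized eigenvalue of the between-class vs. within-class scatter,
    so that every latent direction carries discriminative power."""
    d = z.size(1)
    mu = z.mean(dim=0, keepdim=True)
    s_w = 1e-4 * torch.eye(d, device=z.device)  # regularized within-class scatter
    s_b = torch.zeros(d, d, device=z.device)    # between-class scatter
    for c in labels.unique():
        zc = z[labels == c]
        mc = zc.mean(dim=0, keepdim=True)
        s_w = s_w + (zc - mc).t() @ (zc - mc)
        s_b = s_b + zc.size(0) * (mc - mu).t() @ (mc - mu)
    # Generalized eigenvalues of (S_b, S_w) via symmetric whitening:
    # with S_w = L L^T, they equal the eigenvalues of L^{-1} S_b L^{-T}.
    L = torch.linalg.cholesky(s_w)
    a = torch.linalg.solve_triangular(L, s_b, upper=False)    # L^{-1} S_b
    m = torch.linalg.solve_triangular(L, a.t(), upper=False)  # L^{-1} S_b L^{-T}
    evals = torch.linalg.eigvalsh(0.5 * (m + m.t()))          # symmetrize numerically
    return -evals.min()  # raise the weakest discriminant direction
```

In a full training step, the latent codes from both generators would be concatenated (with their shared class labels) before calling eigenvalue_based_loss, while a standard GAN-style min-max term runs through the two discriminators; maximizing the smallest eigenvalue is one plausible reading of "pushing discriminative power into all latent directions".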
Pages: 14
Related Papers
50 items in total
  • [11] Cross-modal Adversarial Reprogramming
    Neekhara, Paarth
    Hussain, Shehzeen
    Du, Jinglong
    Dubnov, Shlomo
    Koushanfar, Farinaz
    McAuley, Julian
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2898 - 2906
  • [12] Information Aggregation Semantic Adversarial Network for Cross-Modal Retrieval
    Wang, Hongfei
    Feng, Aimin
    Liu, Xuejun
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [13] Adversarial Modality Alignment Network for Cross-Modal Molecule Retrieval
    Zhao, W.
    Zhou, D.
    Cao, B.
    Zhang, K.
    Chen, J.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2024, 5 (01) : 278 - 289
  • [14] SEMANTIC PRESERVING GENERATIVE ADVERSARIAL NETWORK FOR CROSS-MODAL HASHING
    Wu, Fei
    Luo, Xiaokai
    Huang, Qinghua
    Wei, Pengfei
    Sun, Ying
    Dong, Xiwei
    Wu, Zhiyong
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2743 - 2747
  • [15] Cross-Modal Rumor Detection Based on Adversarial Neural Network
    Meng, Jiana
    Wang, Xiaopei
    Li, Ting
    Liu, Shuang
    Zhao, Di
    DATA ANALYSIS AND KNOWLEDGE DISCOVERY, 2022, 6 (12) : 32 - 42
  • [16] MHTN: Modal-Adversarial Hybrid Transfer Network for Cross-Modal Retrieval
    Huang, Xin
    Peng, Yuxin
    Yuan, Mingkuan
    IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (03) : 1047 - 1059
  • [17] Adversarial Graph Attention Network for Multi-modal Cross-modal Retrieval
    Wu, Hongchang
    Guan, Ziyu
    Zhi, Tao
    Zhao, Wei
    Xu, Cai
    Han, Hong
    Yang, Yaming
    2019 10TH IEEE INTERNATIONAL CONFERENCE ON BIG KNOWLEDGE (ICBK 2019), 2019, : 265 - 272
  • [18] Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval
    Xu, Xing
    Song, Jingkuan
    Lu, Huimin
    Yang, Yang
    Shen, Fumin
    Huang, Zi
    ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2018, : 46 - 54
  • [19] Cross-modal deep discriminant analysis
    Dai, Xue-mei
    Li, Sheng-Gang
    NEUROCOMPUTING, 2018, 314 : 437 - 444
  • [20] Cross-Modal Learning with Adversarial Samples
    Li, Chao
    Deng, Cheng
    Gao, Shangqian
    Xie, De
    Liu, Wei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32