Cross-modal discriminant adversarial network

Citations: 16
Authors
Hu, Peng [1 ,2 ]
Peng, Xi [1 ]
Zhu, Hongyuan [2 ]
Lin, Jie [2 ]
Zhen, Liangli [3 ]
Wang, Wei [1 ]
Peng, Dezhong [1 ,4 ,5 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Agcy Sci Technol & Res, Inst Infocomm Res, Singapore, Singapore
[3] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore, Singapore
[4] Shenzhen Peng Cheng Lab, Shenzhen 518052, Peoples R China
[5] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adversarial learning; Cross-modal representation learning; Cross-modal retrieval; Discriminant adversarial network; Cross-modal discriminant mechanism; Latent common space;
DOI
10.1016/j.patcog.2020.107734
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-modal retrieval aims at retrieving relevant points across different modalities, such as retrieving images via texts. One key challenge of cross-modal retrieval is narrowing the heterogeneous gap across diverse modalities. To overcome this challenge, we propose a novel method termed Cross-modal discriminant Adversarial Network (CAN). Taking bi-modal data as a showcase, CAN consists of two parallel modality-specific generators, two modality-specific discriminators, and a Cross-modal Discriminant Mechanism (CDM). Specifically, the generators project diverse modalities into a latent cross-modal discriminant space. Meanwhile, the discriminators compete against the generators to alleviate the heterogeneous discrepancy in this space, i.e., the generators try to generate unified features to confuse the discriminators, while the discriminators aim to classify the modality of the generated features. To further remove redundancy and preserve discrimination, we propose CDM to project the generated features into a single common space, along with a novel eigenvalue-based loss. Thanks to the eigenvalue-based loss, CDM can push as much discriminative power as possible into all latent directions. To demonstrate the effectiveness of CAN, comprehensive experiments are conducted on four multimedia datasets in comparison with 15 state-of-the-art approaches. (C) 2020 Elsevier Ltd. All rights reserved.
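The abstract does not give the exact form of the eigenvalue-based loss, but the underlying idea — scoring how much class-discriminative power each latent direction carries via the eigenvalues of a scatter-matrix ratio, as in classical LDA — can be sketched as follows. This is a hedged illustration only: `scatter_matrices` and `eigenvalue_loss` are hypothetical names, not the paper's implementation, and the exact loss in CAN may differ.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter of features X with labels y."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    return Sw, Sb

def eigenvalue_loss(X, y, eps=1e-6):
    """Illustrative eigenvalue-based discriminant loss: the generalized
    eigenvalues of Sw^-1 Sb measure class separability along each latent
    direction; averaging over *all* of them (rather than keeping only the
    largest) rewards discriminative power in every direction.
    Lower loss = more discriminative embedding."""
    Sw, Sb = scatter_matrices(X, y)
    d = X.shape[1]
    evals = np.linalg.eigvals(np.linalg.solve(Sw + eps * np.eye(d), Sb))
    evals = np.sort(np.real(evals))
    return -float(evals.mean())

# Toy check: well-separated classes yield a much lower loss than random features.
rng = np.random.default_rng(0)
X_good = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(3, 0.1, (50, 2))])
X_bad = rng.normal(0, 1, (100, 2))
y = np.array([0] * 50 + [1] * 50)
assert eigenvalue_loss(X_good, y) < eigenvalue_loss(X_bad, y)
```

In an adversarial pipeline such as the one the abstract describes, a differentiable variant of this quantity would be minimized jointly with the generator losses, so that the common space stays discriminative while the modality gap is reduced.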
Pages: 14
Related Papers
50 records in total
  • [31] Cross-modal and Cross-medium Adversarial Attack for Audio
    Zhang, Liguo
    Tian, Zilin
    Long, Yunfei
    Li, Sizhao
    Yin, Guisheng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 444 - 453
  • [32] Category Alignment Adversarial Learning for Cross-Modal Retrieval
    He, Shiyuan
    Wang, Weiyang
    Wang, Zheng
    Xu, Xing
    Yang, Yang
    Wang, Xiaoming
    Shen, Heng Tao
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (05) : 4527 - 4538
  • [33] Adversarial cross-modal retrieval based on dictionary learning
    Shang, Fei
    Zhang, Huaxiang
    Zhu, Lei
    Sun, Jiande
    NEUROCOMPUTING, 2019, 355 : 93 - 104
  • [34] Adversarial Cross-Modal Retrieval Based on Association Constraint
    Guo Q.
    Qian Y.
    Liang X.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2021, 34 (01): : 68 - 76
  • [35] Independency Adversarial Learning for Cross-Modal Sound Separation
    Lin, Zhenkai
    Ji, Yanli
    Yang, Yang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 4, 2024, : 3522 - 3530
  • [36] Representation separation adversarial networks for cross-modal retrieval
    Deng, Jiaxin
    Ou, Weihua
    Gou, Jianping
    Song, Heping
    Wang, Anzhi
    Xu, Xing
    WIRELESS NETWORKS, 2024, 30 (05) : 3469 - 3481
  • [37] Adaptive Adversarial Learning based cross-modal retrieval
    Li, Zhuoyi
    Lu, Huibin
    Fu, Hao
    Wang, Zhongrui
    Gu, Guanghun
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 123
  • [38] Discrete Fusion Adversarial Hashing for cross-modal retrieval
    Li, Jing
    Yu, En
    Ma, Jianhua
    Chang, Xiaojun
    Zhang, Huaxiang
    Sun, Jiande
    KNOWLEDGE-BASED SYSTEMS, 2022, 253
  • [39] Dual Subspaces with Adversarial Learning for Cross-Modal Retrieval
    Xia, Yaxian
    Wang, Wenmin
    Han, Liang
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I, 2018, 11164 : 654 - 663
  • [40] Deep adversarial metric learning for cross-modal retrieval
    Xu, Xing
    He, Li
    Lu, Huimin
    Gao, Lianli
    Ji, Yanli
    World Wide Web, 2019, 22 : 657 - 672