Cross-modal discriminant adversarial network

Cited by: 16
Authors
Hu, Peng [1 ,2 ]
Peng, Xi [1 ]
Zhu, Hongyuan [2 ]
Lin, Jie [2 ]
Zhen, Liangli [3 ]
Wang, Wei [1 ]
Peng, Dezhong [1 ,4 ,5 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Agcy Sci Technol & Res, Inst Infocomm Res, Singapore, Singapore
[3] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore, Singapore
[4] Shenzhen Peng Cheng Lab, Shenzhen 518052, Peoples R China
[5] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adversarial learning; Cross-modal representation learning; Cross-modal retrieval; Discriminant adversarial network; Cross-modal discriminant mechanism; Latent common space;
DOI
10.1016/j.patcog.2020.107734
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Cross-modal retrieval aims to retrieve relevant points across different modalities, such as retrieving images via texts. One key challenge of cross-modal retrieval is narrowing the heterogeneous gap across diverse modalities. To overcome this challenge, we propose a novel method termed Cross-modal discriminant Adversarial Network (CAN). Taking bi-modal data as a showcase, CAN consists of two parallel modality-specific generators, two modality-specific discriminators, and a Cross-modal Discriminant Mechanism (CDM). Specifically, the generators project the diverse modalities into a latent cross-modal discriminant space, while the discriminators compete against the generators to alleviate the heterogeneous discrepancy in this space: the generators try to generate unified features to confuse the discriminators, and the discriminators aim to classify which modality the generated features came from. To further remove redundancy and preserve discrimination, CDM projects the generated features into a single common space under a novel eigenvalue-based loss, which pushes as much discriminative power as possible into all latent directions. To demonstrate the effectiveness of CAN, comprehensive experiments are conducted on four multimedia datasets against 15 state-of-the-art approaches. (C) 2020 Elsevier Ltd. All rights reserved.
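To make the architecture above concrete, below is a minimal, self-contained PyTorch sketch of the adversarial setup the abstract describes. It is an illustration, not the authors' code: the network sizes, the feature dimensions (4096-d image features, 300-d text features), the single shared modality discriminator (the paper uses two modality-specific ones), and the entropy-style spectrum penalty standing in for the paper's eigenvalue-based loss are all assumptions made for brevity.

import torch
import torch.nn as nn

class Generator(nn.Module):
    # Modality-specific generator: projects raw features into the latent common space.
    def __init__(self, in_dim, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 1024), nn.ReLU(),
                                 nn.Linear(1024, latent_dim))

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    # Modality classifier: predicts which modality a latent code came from.
    # (Simplification: the paper uses two modality-specific discriminators.)
    def __init__(self, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 2))

    def forward(self, z):
        return self.net(z)

def spectrum_spread_loss(z, eps=1e-6):
    # Illustrative stand-in for the paper's eigenvalue-based loss: penalize an
    # unbalanced eigenvalue spectrum of the latent covariance so discriminative
    # power is spread over all latent directions rather than a few.
    z = z - z.mean(dim=0, keepdim=True)
    cov = z.t() @ z / (z.size(0) - 1)
    eig = torch.linalg.eigvalsh(cov).clamp_min(eps)
    p = eig / eig.sum()
    return (p * p.log()).sum()  # negative entropy; minimal when eigenvalues are uniform

# One adversarial step on a dummy batch (the 4096-d/300-d dimensions are assumptions).
img_g, txt_g, disc = Generator(4096), Generator(300), Discriminator()
ce = nn.CrossEntropyLoss()
opt_g = torch.optim.Adam(list(img_g.parameters()) + list(txt_g.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

img, txt = torch.randn(32, 4096), torch.randn(32, 300)
modality = torch.cat([torch.zeros(32), torch.ones(32)]).long()

z = torch.cat([img_g(img), txt_g(txt)])

# Discriminator step: learn to tell the two modalities apart.
d_loss = ce(disc(z.detach()), modality)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: fool the discriminator (flipped labels) while spreading the spectrum.
g_loss = ce(disc(z), 1 - modality) + spectrum_spread_loss(z)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()

In a full training run one would also add a supervised term (e.g., a label classifier over z), since the minimax game alone only removes modality information and does not by itself keep the common space discriminative.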
Pages: 14
Related Papers
50 records in total
  • [1] Dual discriminant adversarial cross-modal retrieval
    He, Pei
    Wang, Meng
    Tu, Ding
    Wang, Zhuo
    APPLIED INTELLIGENCE, 2023, 53 (04) : 4257 - 4267
  • [2] Multimodal adversarial network for cross-modal retrieval
    Hu, Peng
    Peng, Dezhong
    Wang, Xu
    Xiang, Yong
    KNOWLEDGE-BASED SYSTEMS, 2019, 180 : 38 - 50
  • [3] Discriminant Adversarial Hashing Transformer for Cross-modal Vessel Image Retrieval
    Guan X.
    Guo J.
    Lu Y.
Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2023, 45 (12) : 4411 - 4420
  • [4] Cross-Modal Surface Material Retrieval Using Discriminant Adversarial Learning
    Zheng, Wendong
    Liu, Huaping
    Wang, Bowen
    Sun, Fuchun
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2019, 15 (09) : 4978 - 4987
  • [5] Discriminant Cross-modal Hashing
    Xu, Xing
    Shen, Fumin
    Yang, Yang
    Shen, Heng Tao
    ICMR'16: PROCEEDINGS OF THE 2016 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2016, : 305 - 308
  • [6] DEEP ADVERSARIAL QUANTIZATION NETWORK FOR CROSS-MODAL RETRIEVAL
    Zhou, Yu
    Feng, Yong
    Zhou, Mingliang
    Qiang, Baohua
U, Leong Hou
    Zhu, Jiajie
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 4325 - 4329
  • [7] Adversarial Graph Convolutional Network for Cross-Modal Retrieval
    Dong, Xinfeng
    Liu, Li
    Zhu, Lei
    Nie, Liqiang
    Zhang, Huaxiang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (03) : 1634 - 1645
  • [8] Cross-modal dual subspace learning with adversarial network
    Shang, Fei
    Zhang, Huaxiang
    Sun, Jiande
    Nie, Liqiang
    Liu, Li
    NEURAL NETWORKS, 2020, 126 : 132 - 142
  • [9] Adversarial Cross-Modal Retrieval
    Wang, Bokun
    Yang, Yang
    Xu, Xing
    Hanjalic, Alan
    Shen, Heng Tao
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 154 - 162