MFF: Multi-modal feature fusion for zero-shot learning

Cited by: 8
Authors
Cao, Weipeng [1, 2]
Wu, Yuhao [2]
Huang, Chengchao [3]
Patwary, Muhammed J. A. [4]
Wang, Xizhao [2]
Affiliations
[1] Civil Aviat Univ China, CAAC Key Lab Civil Aviat Wide Surveillance & Safet, Tianjin 300300, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[3] Chinese Acad Sci, Nanjing Inst Software Technol, Nanjing 210000, Jiangsu, Peoples R China
[4] Int Islamic Univ Chittagong, Dept Comp Sci & Engn, Chattogram 4318, Bangladesh
Funding
National Natural Science Foundation of China
Keywords
Zero-shot learning; Generative method; Variational auto-encoder; Generative adversarial network; Feature fusion
DOI
10.1016/j.neucom.2022.09.070
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Generative Zero-Shot Learning (ZSL) methods generally generate pseudo-samples/features based on the semantic description information of unseen classes, thereby transforming ZSL tasks into traditional supervised learning tasks. Under this learning paradigm, the quality of the pseudo-samples/features guided by the classes' semantic description information is the key to the success of the model. However, the semantic description information used in existing generative methods is mainly a low-dimensional representation of the classes (e.g., attributes), which leads to low-quality generated pseudo-samples/features and may aggravate the domain shift problem. To alleviate this problem, we introduce the visual principal component feature, which is extracted by a principal component analysis network, to make up for the deficiency of using only semantic description information, and we propose a novel Variational Auto-Encoder (VAE) and Generative Adversarial Network (GAN) based generative method for ZSL, which we call the Multi-modal Feature Fusion algorithm (MFF). In MFF, the input of different modal information enables the VAE to better fit the original data distribution, and the proposed alignment loss ensures the consistency of the generated visual features and the corresponding semantic features. With the help of high-quality pseudo-samples/features, the ZSL model can make more accurate predictions for unseen classes. Extensive experiments on five public datasets demonstrate that our proposed algorithm outperforms several state-of-the-art methods under both ZSL and generalized ZSL settings. (c) 2022 Elsevier B.V. All rights reserved.
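The generative ZSL paradigm described in the abstract (generate pseudo-features for unseen classes from their semantic descriptions, then train an ordinary supervised classifier on them) can be illustrated with a minimal sketch. This is not the MFF algorithm: a ridge regression stands in for the VAE/GAN generator, there is no principal component analysis network and no alignment loss, and the data are synthetic. All function names, dimensions, and noise levels below are assumptions chosen for illustration.

```python
import numpy as np


def fit_generator(attrs, feats, reg=1e-3):
    """Ridge regression from attributes to visual features.

    Assumption: a linear map stands in for the VAE/GAN generator.
    """
    gram = attrs.T @ attrs + reg * np.eye(attrs.shape[1])
    return np.linalg.solve(gram, attrs.T @ feats)


def generate_pseudo_features(w, class_attrs, n_per_class, noise, rng):
    """Sample noisy pseudo-features around each class's predicted mean."""
    feats, labels = [], []
    for label, attr in class_attrs.items():
        mean = attr @ w
        samples = mean + noise * rng.standard_normal((n_per_class, mean.shape[0]))
        feats.append(samples)
        labels.extend([label] * n_per_class)
    return np.vstack(feats), np.array(labels)


def fit_centroids(feats, labels):
    """Supervised step: a nearest-centroid classifier on the pseudo-features."""
    classes = np.unique(labels)
    centroids = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(classes, centroids, feats):
    """Assign each feature vector to the class with the closest centroid."""
    dists = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


def demo(seed=0):
    """Run the pipeline on synthetic data and return unseen-class accuracy."""
    rng = np.random.default_rng(seed)
    # Hypothetical ground truth: features are a linear function of attributes.
    w_true = rng.standard_normal((4, 8))
    # Seen classes: 6 classes, 20 real samples each.
    seen_attrs = rng.random((6, 4))
    seen_rows = np.repeat(seen_attrs, 20, axis=0)
    seen_feats = seen_rows @ w_true + 0.05 * rng.standard_normal((120, 8))
    w = fit_generator(seen_rows, seen_feats)
    # Unseen classes: only their attribute vectors are available for training.
    unseen = {
        "unseen_a": np.array([1.0, 0.0, 0.0, 1.0]),
        "unseen_b": np.array([0.0, 1.0, 1.0, 0.0]),
    }
    pseudo_x, pseudo_y = generate_pseudo_features(w, unseen, 50, 0.05, rng)
    classes, centroids = fit_centroids(pseudo_x, pseudo_y)
    # Evaluate on "real" unseen-class samples drawn from the ground truth.
    test_x = np.vstack(
        [a @ w_true + 0.05 * rng.standard_normal((30, 8)) for a in unseen.values()]
    )
    test_y = np.repeat(list(unseen.keys()), 30)
    return float((predict(classes, centroids, test_x) == test_y).mean())


if __name__ == "__main__":
    print(f"unseen-class accuracy: {demo():.2f}")
```

In this toy setting the classifier never sees a real sample from the unseen classes. Its accuracy depends entirely on how well the generated features match the true feature distribution, which is the quality issue the abstract identifies as central to generative ZSL.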
Pages: 172-180
Page count: 9