Feature Mixture on Pre-Trained Model for Few-Shot Learning

Cited by: 0
Authors
Wang, Shuo [1 ,2 ]
Lu, Jinda [3 ]
Xu, Haiyang [4 ]
Hao, Yanbin [1 ,2 ]
He, Xiangnan [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Sch Data Sci, Hefei, Anhui, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei, Anhui, Peoples R China
[3] Univ Sci & Technol China, Sch Cyber Sci & Technol, Hefei 230009, Anhui, Peoples R China
[4] Univ Sci & Technol China, Sch Gifted Young, Hefei 230009, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Training; Task analysis; Feature extraction; Data augmentation; Robustness; Generative adversarial networks; Costs; Few-shot learning; feature mixture; data augmentation; inductive learning; transductive learning
DOI
10.1109/TIP.2024.3411452
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Few-shot learning (FSL) aims to recognize novel objects from limited training samples. A robust feature extractor (backbone) can significantly improve the recognition performance of an FSL model. However, training an effective backbone is challenging because 1) designing and validating backbone architectures is time-consuming and expensive, and 2) a backbone trained on the known (base) categories tends to focus on the textures of the objects it has seen, which describe novel samples poorly. To address these problems, we propose a feature mixture operation on the pre-trained (fixed) features: 1) we replace part of the values of a feature map from a novel category with the content of other feature maps, which increases the generalizability and diversity of the training samples and avoids retraining a complex backbone at high computational cost; 2) we use the similarities between features to constrain the mixture operation, which helps the classifier focus on representations of the novel object that would otherwise remain hidden in the features of a backbone with biased training. Experimental studies on five benchmark datasets in both inductive and transductive settings demonstrate the effectiveness of our feature mixture (FM). Specifically, compared with the baseline on the Mini-ImageNet dataset, it achieves accuracy improvements of 3.8% and 4.2% for 1 and 5 training samples, respectively. Additionally, the proposed mixture operation can be used to improve other existing FSL methods based on backbone training.
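The abstract specifies neither the mixing granularity nor the exact form of the similarity constraint. The following is a minimal PyTorch sketch of the general idea only, assuming vector-level features, a cosine-similarity donor choice, and a random dimension mask; the function name, the mix_ratio parameter, and the donor-selection rule are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def feature_mixture(novel_feat, support_feats, mix_ratio=0.3):
    """Hypothetical sketch of the feature-mixture idea from the abstract:
    replace a subset of a novel sample's (pre-trained, fixed) feature
    values with content from another feature map, with the donor chosen
    by cosine similarity so the mixture stays semantically close to the
    novel object. All names and hyper-parameters are assumptions.

    novel_feat:    (D,) feature vector of a novel-class sample
    support_feats: (N, D) pool of other pre-trained feature vectors
    mix_ratio:     fraction of feature dimensions to replace
    """
    # Similarity-constrained donor selection: pick the pooled feature
    # most similar to the novel sample as the mixing partner.
    sims = F.cosine_similarity(novel_feat.unsqueeze(0), support_feats, dim=1)
    donor = support_feats[sims.argmax()]

    # Replace a random subset of dimensions with the donor's values,
    # producing an augmented sample without touching the backbone.
    d = novel_feat.numel()
    idx = torch.randperm(d)[: int(mix_ratio * d)]
    mixed = novel_feat.clone()
    mixed[idx] = donor[idx]
    return mixed

# Toy usage: augment one novel-class feature from a 512-d backbone.
novel = torch.randn(512)
pool = torch.randn(20, 512)
augmented = feature_mixture(novel, pool)
```

Because the operation acts only on fixed, pre-extracted features, such an augmentation can in principle be bolted onto any FSL classifier without backbone retraining, which is the cost argument the abstract makes.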
Pages: 4104-4115
Page count: 12