FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning

Cited by: 0
Authors
Song, Kun [1 ]
Ma, Huimin [1 ]
Zou, Bochao [1 ]
Zhang, Huishuai [3 ]
Huang, Weiran [2 ]
Affiliations
[1] Univ Sci & Technol Beijing, SCCE, Beijing, Peoples R China
[2] Shanghai Jiao Tong Univ, SEIEE, Qing Yuan Res Inst, Shanghai, Peoples R China
[3] Microsoft Res Asia, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the limited availability of data, existing few-shot learning methods trained from scratch fail to achieve satisfactory performance. In contrast, large-scale pre-trained models such as CLIP demonstrate remarkable few-shot and zero-shot capabilities. To improve a pre-trained model's performance on downstream tasks, fine-tuning on downstream data is often necessary. However, fine-tuning a pre-trained model reduces its generalizability under distribution shift, and the limited number of samples in few-shot learning makes the model highly susceptible to overfitting. Consequently, existing fine-tuning methods for few-shot learning mainly restrict themselves to fine-tuning the model's classification head or introducing additional structure. In this paper, we introduce a fine-tuning approach termed Feature Discrimination Alignment (FD-Align). Our method aims to bolster the model's generalizability by preserving the consistency of spurious features throughout the fine-tuning process. Extensive experimental results validate the efficacy of our approach on both in-distribution (ID) and out-of-distribution (OOD) tasks. Once fine-tuned, the model integrates seamlessly with existing methods, yielding further performance improvements. Our code can be found at https://github.com/skingorz/FD-Align.
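Based only on the description in this abstract, the following is a minimal PyTorch-style sketch of what such an objective might look like: a standard few-shot classification loss combined with a KL term that keeps the model's distribution over spurious-feature prototypes consistent with the frozen pre-trained encoder. This is an illustrative assumption, not the authors' released implementation (see the GitHub link above); the names encoder_ft, encoder_frozen, class_prototypes, spurious_prototypes, and alpha are all hypothetical.

import torch
import torch.nn.functional as F

def fd_align_loss(encoder_ft, encoder_frozen, images, labels,
                  class_prototypes, spurious_prototypes, alpha=1.0):
    # Features from the encoder being fine-tuned and from a frozen
    # copy of the pre-trained encoder (hypothetical names).
    z_ft = F.normalize(encoder_ft(images), dim=-1)
    with torch.no_grad():
        z_pre = F.normalize(encoder_frozen(images), dim=-1)

    # Standard classification loss against class prototypes
    # (e.g., CLIP text embeddings of the class names).
    loss_cls = F.cross_entropy(z_ft @ class_prototypes.t(), labels)

    # Distributions over spurious-feature prototypes before and after
    # fine-tuning; the KL term keeps them consistent, preserving the
    # pre-trained model's treatment of spurious features.
    log_p_ft = F.log_softmax(z_ft @ spurious_prototypes.t(), dim=-1)
    p_pre = F.softmax(z_pre @ spurious_prototypes.t(), dim=-1)
    loss_spur = F.kl_div(log_p_ft, p_pre, reduction="batchmean")

    return loss_cls + alpha * loss_spur

Under these assumptions, the weight alpha trades off fitting the few-shot classes against keeping the fine-tuned encoder's view of spurious features aligned with the pre-trained one.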
Pages: 14
Related Papers
50 records in total
  • [21] Exploring Few-Shot Fine-Tuning Strategies for Models of Visually Grounded Speech
    Miller, Tyler
    Harwath, David
    INTERSPEECH 2022, 2022, : 1416 - 1420
  • [22] Pruning Pre-trained Language Models Without Fine-Tuning
    Jiang, Ting
    Wang, Deqing
    Zhuang, Fuzhen
    Xie, Ruobing
    Xia, Feng
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 594 - 605
  • [23] Fine-Tuning of CLIP in Few-Shot Scenarios via Supervised Contrastive Learning
    Luo, Jing
    Wu, Guangxing
    Liu, Hongmei
    Wang, Ruixuan
    PATTERN RECOGNITION AND COMPUTER VISION, PT III, PRCV 2024, 2025, 15033 : 104 - 117
  • [24] Visual semantic alignment network based on pre-trained ViT for few-shot image classification
    Zhang, Jiaming
    Wu, Jijie
    Li, Xiaoxu
    2024 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2024,
  • [25] Revisiting k-NN for Fine-Tuning Pre-trained Language Models
    Li, Lei
    Chen, Jing
    Tian, Bozhong
    Zhang, Ningyu
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2023, 2023, 14232 : 327 - 338
  • [26] Fine-tuning the hyperparameters of pre-trained models for solving multiclass classification problems
    Kaibassova, D.
    Nurtay, M.
    Tau, A.
    Kissina, M.
    COMPUTER OPTICS, 2022, 46 (06) : 971 - 979
  • [27] TOKEN Is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models
    Davody, Ali
    Adelani, David Ifeoluwa
    Kleinbauer, Thomas
    Klakow, Dietrich
    TEXT, SPEECH, AND DIALOGUE (TSD 2022), 2022, 13502 : 138 - 150
  • [28] Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks
    Xi, Zhaohan
    Du, Tianyu
    Li, Changjiang
    Pang, Ren
    Ji, Shouling
    Chen, Jinghui
    Ma, Fenglong
    Wang, Ting
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [29] Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively
    Zhang, Haojie
    Li, Ge
    Li, Jia
    Zhang, Zhongjin
    Zhu, Yuqi
    Jin, Zhi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [30] An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models
    Liu, Xueqing
    Wang, Chi
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 2286 - 2300