Meta-Adapters: Parameter Efficient Few-shot Fine-tuning through Meta-Learning

Cited by: 0
Authors
Bansal, Trapit [1 ]
Alzubi, Salaheddin [1 ]
Wang, Tong [2 ]
Lee, Jay-Yoon [1 ]
McCallum, Andrew [1 ]
Affiliations
[1] Univ Massachusetts, Amherst, MA 01003 USA
[2] Microsoft Res, Montreal, PQ, Canada
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Consistent improvements in the representational capacity of large pre-trained transformers have made it increasingly viable to serve these models as shared priors that can be fine-tuned on a large number of downstream tasks. However, fine-tuning the entire model for every task of interest requires storing a separate copy of all the model parameters, rendering this approach highly impractical at scale. Recently introduced Adapter methods offer a promising alternative, in which only a small number of additional parameters are introduced per task specifically for fine-tuning. However, Adapters often require large amounts of task-specific data for good performance and do not work well in data-scarce few-shot scenarios. In this paper, we approach parameter-efficient fine-tuning in few-shot settings from a meta-learning perspective. We introduce Meta-Adapters, small blocks of meta-learned adapter layers inserted into a pre-trained model that re-purpose a frozen pre-trained model into a parameter-efficient few-shot learner. Meta-Adapters perform competitively with state-of-the-art few-shot learning methods that require full fine-tuning, while fine-tuning only 0.6% of the parameters. We evaluate Meta-Adapters along with multiple transfer-learning baselines on an evaluation suite of 17 classification tasks and find that they improve few-shot accuracy by a large margin over competitive parameter-efficient methods, while requiring significantly fewer parameters for fine-tuning. Moreover, when comparing few-shot prompting of GPT-3 against few-shot fine-tuning with Meta-Adapters, we find that Meta-Adapters perform competitively while working with pre-trained transformers that are roughly three orders of magnitude (1590x) smaller than GPT-3.
Pages: 18
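The abstract describes Meta-Adapters as small meta-learned adapter blocks inserted into a frozen pre-trained transformer, with only about 0.6% of parameters fine-tuned per task. As a rough illustration of that parameter-efficiency argument, the PyTorch sketch below inserts standard bottleneck adapters (in the style of Houlsby et al.) into a frozen toy transformer and counts the trainable fraction. The class names `BottleneckAdapter` and `AdaptedLayer`, the bottleneck size of 16, and the toy backbone are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual.
    The up-projection is zero-initialized so the adapter starts as an
    identity function and does not perturb the frozen pre-trained model."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 16):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()
        nn.init.zeros_(self.up.weight)  # near-identity at initialization
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


class AdaptedLayer(nn.Module):
    """Wraps one frozen transformer layer, applying an adapter to its output."""

    def __init__(self, frozen_layer: nn.Module, hidden_dim: int):
        super().__init__()
        self.layer = frozen_layer
        self.adapter = BottleneckAdapter(hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.layer(x))


# Toy stand-in for a pre-trained transformer (hidden size 768, 12 layers).
backbone = nn.Sequential(
    *[nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
      for _ in range(12)]
)
for p in backbone.parameters():
    p.requires_grad = False  # backbone stays frozen; only adapters are tuned

model = nn.Sequential(*[AdaptedLayer(layer, hidden_dim=768) for layer in backbone])

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
# With a small bottleneck the trainable fraction is well under 1%, in line
# with the paper's reported 0.6% (the exact figure depends on the backbone).
print(f"trainable: {trainable} / {total} ({100 * trainable / total:.2f}%)")
```

Note that this sketch shows only the adapter insertion and backbone freezing; in the paper, the adapter parameters are additionally meta-learned across training tasks so that they serve as a strong initialization for few-shot fine-tuning, and that outer meta-training loop is not shown here.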