Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models

Times cited: 0
Authors
Logan, Robert L. [1 ]
Balazevic, Ivana [2 ,4 ]
Wallace, Eric [3 ]
Petroni, Fabio [4 ]
Singh, Sameer [1 ]
Riedel, Sebastian [4 ,5 ]
Affiliations
[1] UC Irvine, Irvine, CA 92697 USA
[2] DeepMind, London, England
[3] Univ Calif Berkeley, Berkeley, CA USA
[4] Facebook AI Res, Menlo Pk, CA USA
[5] UCL, London, England
Keywords
(none listed)
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning. In this work, we show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering. In fact, one can use null prompts (prompts that contain neither task-specific templates nor training examples) and achieve accuracy competitive with manually tuned prompts across a wide range of tasks. While finetuning LMs does introduce new parameters for each downstream task, we show that this memory overhead can be substantially reduced: finetuning only the bias terms can achieve comparable or better accuracy than standard finetuning while updating only 0.1% of the parameters. All in all, we recommend finetuning LMs for few-shot learning as it is more accurate, has relatively stable performance across different prompts, and can be made nearly as efficient as using frozen LMs.
Pages: 2824-2835
Page count: 12
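
The bias-only finetuning and null-prompt ideas described in the abstract are straightforward to sketch. The snippet below is a minimal, illustrative example assuming PyTorch and the HuggingFace transformers library: it loads roberta-base with a sequence-classification head, freezes every parameter except the bias terms (plus the freshly initialised head, which has no pretrained weights to reuse), and takes a few optimisation steps on two toy sentiment examples fed in as null prompts, i.e. raw inputs with no task-specific template or in-context demonstrations. The model name, toy data, and hyperparameters are assumptions chosen for illustration, not the authors' exact experimental setup.

# Minimal sketch of bias-only ("BitFit-style") few-shot finetuning.
# Assumes PyTorch and HuggingFace `transformers`; the model name, toy
# data, and hyperparameters are illustrative assumptions, not the
# paper's exact configuration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Freeze everything except the bias terms and the randomly initialised
# classification head (the head must be trained from scratch).
for name, param in model.named_parameters():
    param.requires_grad = name.endswith(".bias") or name.startswith("classifier")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} of {total:,} parameters ({100 * trainable / total:.2f}%)")

# "Null prompts": the raw inputs, with no task-specific template and no
# in-context training examples. Two toy sentiment examples stand in for
# a real few-shot training set.
texts = ["a gripping, beautifully shot film", "a dull and lifeless remake"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

model.train()
for step in range(10):  # a handful of steps suffices for such a tiny training set
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

Note that the printed trainable fraction will come out somewhat above the roughly 0.1% quoted in the abstract, because this sketch also trains the new classification head; the paper's own experiments use prompt-based finetuning that reuses the pretrained LM head, so essentially only bias terms need to change.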