Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models

Cited by: 0
Authors
Logan, Robert L. [1]
Balazevic, Ivana [2, 4]
Wallace, Eric [3]
Petroni, Fabio [4]
Singh, Sameer [1]
Riedel, Sebastian [4, 5]
Affiliations
[1] UC Irvine, Irvine, CA 92697 USA
[2] DeepMind, London, England
[3] Univ Calif Berkeley, Berkeley, CA USA
[4] Facebook AI Res, Menlo Pk, CA USA
[5] UCL, London, England
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning. In this work, we show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering. In fact, one can use null prompts, prompts that contain neither task-specific templates nor training examples, and achieve competitive accuracy to manually-tuned prompts across a wide range of tasks. While finetuning LMs does introduce new parameters for each downstream task, we show that this memory overhead can be substantially reduced: finetuning only the bias terms can achieve comparable or better accuracy than standard finetuning while only updating 0.1% of the parameters. All in all, we recommend finetuning LMs for few-shot learning as it is more accurate, has relatively stable performance across different prompts, and can be made nearly as efficient as using frozen LMs.
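The two ideas in the abstract can be illustrated with a rough sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the `[MASK]` token, the hypothetical helper names `null_prompt`, `template_prompt`, and `bias_fraction`, the example template wording, and the simplified Transformer parameter count (attention and feed-forward layers only, ignoring embeddings and layer normalization). It is not the authors' code.

```python
# Illustrative sketch only; names, template wording, and the parameter-count
# model are assumptions, not taken from the paper.

def null_prompt(fields, mask_token="[MASK]"):
    """Concatenate the raw input fields and append a mask token.

    No task-specific template text and no training examples are added.
    """
    return " ".join(list(fields) + [mask_token])


def template_prompt(premise, hypothesis, mask_token="[MASK]"):
    """A hand-written pattern, shown only for contrast (hypothetical wording)."""
    return f'"{premise}"? {mask_token}, "{hypothesis}"'


def bias_fraction(hidden, ffn, layers):
    """Share of bias terms among parameters of a simplified encoder stack.

    Per layer (assumed): four hidden x hidden attention projections and two
    feed-forward matrices (hidden x ffn and ffn x hidden), each with a bias.
    """
    weights = layers * (4 * hidden * hidden + 2 * hidden * ffn)
    biases = layers * (4 * hidden + ffn + hidden)
    return biases / (weights + biases)


if __name__ == "__main__":
    fields = ["A man is sleeping.", "A man is awake."]
    print(null_prompt(fields))
    print(template_prompt(*fields))
    # Rough share of bias parameters for an assumed large encoder shape.
    print(f"{bias_fraction(hidden=1024, ffn=4096, layers=24):.4%}")
```

Under these assumed shapes the bias share comes out below 0.1%, which is the same order of magnitude as the figure quoted in the abstract; the exact value depends on the architecture and on which parameters are counted.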
Pages: 2824-2835
Page count: 12
Related papers
50 in total
  • [1] True Few-Shot Learning with Language Models
    Perez, Ethan
    Kiela, Douwe
    Cho, Kyunghyun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [2] Multimodal Few-Shot Learning with Frozen Language Models
    Tsimpoukelli, Maria
    Menick, Jacob
    Cabi, Serkan
    Eslami, S. M. Ali
    Vinyals, Oriol
    Hill, Felix
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [3] Language Models are Few-Shot Butlers
    Micheli, Vincent
    Fleuret, Francois
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9312 - 9318
  • [4] Language Models are Few-Shot Learners
    Brown, Tom B.
    Mann, Benjamin
    Ryder, Nick
    Subbiah, Melanie
    Kaplan, Jared
    Dhariwal, Prafulla
    Neelakantan, Arvind
    Shyam, Pranav
    Sastry, Girish
    Askell, Amanda
    Agarwal, Sandhini
    Herbert-Voss, Ariel
    Krueger, Gretchen
    Henighan, Tom
    Child, Rewon
    Ramesh, Aditya
    Ziegler, Daniel M.
    Wu, Jeffrey
    Winter, Clemens
    Hesse, Christopher
    Chen, Mark
    Sigler, Eric
    Litwin, Mateusz
    Gray, Scott
    Chess, Benjamin
    Clark, Jack
    Berner, Christopher
    McCandlish, Sam
    Radford, Alec
    Sutskever, Ilya
    Amodei, Dario
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [5] ATLAS: Few-shot Learning with Retrieval Augmented Language Models
    Izacard, Gautier
    Lewis, Patrick
    Lomeli, Maria
    Hosseini, Lucas
    Petroni, Fabio
    Schick, Timo
    Dwivedi-Yu, Jane
    Joulin, Armand
    Riedel, Sebastian
    Grave, Edouard
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [6] Learning Meta Soft Prompt for Few-Shot Language Models
    Chien, Jen-Tzung
    Chen, Ming-Yen
    Xue, Jing-Hao
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 57 - 62
  • [7] Few-shot Unified Question Answering: Tuning Models or Prompts?
    Bansal, Srijan
    Yavuz, Semih
    Pang, Bo
    Bhat, Meghana
    Zhou, Yingbo
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 8200 - 8220
  • [8] Automating MedSAM by Learning Prompts with Weak Few-Shot Supervision
    Gaillochet, Melanie
    Desrosiers, Christian
    Lombaert, Herve
    FOUNDATION MODELS FOR GENERAL MEDICAL AI, MEDAGI 2024, 2025, 15184 : 61 - 70
  • [9] Few-shot Subgoal Planning with Language Models
    Logeswaran, Lajanugen
    Fu, Yao
    Lee, Moontae
    Lee, Honglak
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 5493 - 5506
  • [10] Prompts in Few-Shot Named Entity Recognition
    Rozhkov, I. S.
    Loukachevitch, N. V.
    PATTERN RECOGNITION AND IMAGE ANALYSIS, 2023, 33 (02) : 122 - 131