Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models

Times cited: 0
Authors
Logan, Robert L. [1 ]
Balazevic, Ivana [2 ,4 ]
Wallace, Eric [3 ]
Petroni, Fabio [4 ]
Singh, Sameer [1 ]
Riedel, Sebastian [4 ,5 ]
Affiliations
[1] UC Irvine, Irvine, CA 92697 USA
[2] DeepMind, London, England
[3] Univ Calif Berkeley, Berkeley, CA USA
[4] Facebook AI Res, Menlo Pk, CA USA
[5] UCL, London, England
Keywords
(none listed)
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning. In this work, we show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering. In fact, one can use null prompts (prompts that contain neither task-specific templates nor training examples) and achieve accuracy competitive with manually tuned prompts across a wide range of tasks. While finetuning LMs does introduce new parameters for each downstream task, we show that this memory overhead can be substantially reduced: finetuning only the bias terms can achieve comparable or better accuracy than standard finetuning while updating only 0.1% of the parameters. All in all, we recommend finetuning LMs for few-shot learning as it is more accurate, has relatively stable performance across different prompts, and can be made nearly as efficient as using frozen LMs.
Pages: 2824-2835
Page count: 12
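
The bias-only finetuning and null-prompt ideas described in the abstract are straightforward to sketch. The snippet below is a minimal, illustrative example assuming PyTorch and the HuggingFace transformers library: it loads roberta-base with a sequence-classification head, freezes every parameter except the bias terms (plus the freshly initialised head, which has no pretrained weights to reuse), and takes a few optimisation steps on two toy sentiment examples fed in as null prompts, i.e. raw inputs with no task-specific template or in-context demonstrations. The model name, toy data, and hyperparameters are assumptions chosen for illustration, not the authors' exact experimental setup.

# Minimal sketch of bias-only ("BitFit-style") few-shot finetuning.
# Assumes PyTorch and HuggingFace `transformers`; the model name, toy
# data, and hyperparameters are illustrative assumptions, not the
# paper's exact configuration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Freeze everything except the bias terms and the randomly initialised
# classification head (the head must be trained from scratch).
for name, param in model.named_parameters():
    param.requires_grad = name.endswith(".bias") or name.startswith("classifier")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} of {total:,} parameters ({100 * trainable / total:.2f}%)")

# "Null prompts": the raw inputs, with no task-specific template and no
# in-context training examples. Two toy sentiment examples stand in for
# a real few-shot training set.
texts = ["a gripping, beautifully shot film", "a dull and lifeless remake"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

model.train()
for step in range(10):  # a handful of steps suffices for such a tiny training set
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

Note that the printed trainable fraction will come out somewhat above the roughly 0.1% quoted in the abstract, because this sketch also trains the new classification head; the paper's own experiments use prompt-based finetuning that reuses the pretrained LM head, so essentially only bias terms need to change.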