Active Learning Principles for In-Context Learning with Large Language Models

Cited by: 0
Authors
Margatina, Katerina [1 ,2 ]
Schick, Timo [2 ]
Aletras, Nikolaos [1 ]
Dwivedi-Yu, Jane [2 ]
Affiliations
[1] University of Sheffield, Sheffield, South Yorkshire, England
[2] Meta, FAIR, Menlo Park, CA, USA
Keywords
DOI: not available
CLC Number: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
The remarkable advancements in large language models (LLMs) have significantly enhanced predictive performance in few-shot learning settings. Using only a small number of labeled examples, referred to as demonstrations, LLMs can effectively perform the task at hand through in-context learning. However, the process of selecting demonstrations to maximize performance has received limited attention in prior work. This paper addresses the problem of identifying the most informative demonstrations for few-shot learning by framing it as a pool-based Active Learning (AL) problem over a single iteration. We compare standard AL algorithms based on uncertainty, diversity, and similarity, and consistently observe that similarity-based selection outperforms all other methods, including random sampling. Our extensive experimentation with a diverse range of GPT and OPT models across 24 classification and multiple-choice tasks, coupled with thorough analysis, unambiguously demonstrates the importance of using demonstrations that are semantically similar to the domain of the test examples. In fact, we show higher average classification performance using "similar" demonstrations with GPT-2 (124M) than random demonstrations with GPT-NeoX (20B). Notably, while diversity sampling shows promise, uncertainty sampling, despite its success in conventional supervised-learning AL scenarios, performs poorly in in-context learning.
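The similarity-based selection the abstract describes can be sketched as follows. This is a minimal, illustrative version: a toy bag-of-words cosine similarity stands in for the semantic sentence embeddings the paper would use, and all names, fields, and example data here are hypothetical.

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity between two texts as bag-of-words vectors
    (a stand-in for embedding-based semantic similarity)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_demonstrations(pool, test_example: str, k: int = 4):
    """Single-iteration, pool-based selection: rank the labeled pool by
    similarity to the test example and keep the top-k as demonstrations."""
    ranked = sorted(pool, key=lambda ex: cosine_sim(ex["text"], test_example),
                    reverse=True)
    return ranked[:k]

# Hypothetical labeled pool for a sentiment task.
pool = [
    {"text": "the film was a delight to watch", "label": "positive"},
    {"text": "interest rates rose sharply this quarter", "label": "neutral"},
    {"text": "an utterly boring and predictable movie", "label": "negative"},
    {"text": "the stadium crowd cheered the winning goal", "label": "neutral"},
]
demos = select_demonstrations(pool, "boring film with predictable plot", k=2)
```

The selected demonstrations would then be concatenated ahead of the test input to form the few-shot prompt; the paper's point is that this similarity criterion beats uncertainty, diversity, and random sampling on average.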
Pages: 5011-5034 (24 pages)
Related Papers (50 total)
  • [31] In-Context Impersonation Reveals Large Language Models' Strengths and Biases
    Salewski, Leonard
    Alaniz, Stephan
    Rio-Torto, Isabel
    Schulz, Eric
    Akata, Zeynep
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [32] ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning
    She, Jingyuan Selena
    Potts, Christopher
    Bowman, Samuel R.
    Geiger, Atticus
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1803 - 1821
  • [33] What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning
    Pan, Jane
    Gao, Tianyu
    Chen, Howard
    Chen, Danqi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 8298 - 8319
  • [34] Enhancing In-Context Learning of Large Language Models for Knowledge Graph Reasoning via Rule-and-Reinforce Selected Triples
    Wang, Shaofei
    APPLIED SCIENCES-BASEL, 2025, 15 (03)
  • [35] Structured State Space Models for In-Context Reinforcement Learning
    Lu, Chris
    Schroecker, Yannick
    Gu, Albert
    Parisotto, Emilio
    Foerster, Jakob
    Singh, Satinder
    Behbahani, Feryal
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [36] Meta-learning via Language Model In-context Tuning
    Chen, Yanda
    Zhong, Ruiqi
    Zha, Sheng
    Karypis, George
    He, He
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022, : 719 - 730
  • [37] Diluie: constructing diverse demonstrations of in-context learning with large language model for unified information extraction
    Guo Q.
    Guo Y.
    Zhao J.
    Neural Computing and Applications, 2024, 36 (22) : 13491 - 13512
  • [38] Cultural Understanding Using In-context Learning and Masked Language Modeling
    Qian, Ming
    Newton, Charles
    Qian, Davis
    HCI INTERNATIONAL 2021 - LATE BREAKING PAPERS: MULTIMODALITY, EXTENDED REALITY, AND ARTIFICIAL INTELLIGENCE, 2021, 13095 : 500 - 508
  • [39] Cascade Large Language Model via In-Context Learning for Depression Detection on Chinese Social Media
    Zheng, Tong
    Guo, Yanrong
    Hong, Richang
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT 1, 2025, 15031 : 353 - 366
  • [40] Large Language Model Cascades and Persona-Based In-Context Learning for Multilingual Sexism Detection
    Tian, Lin
    Huang, Nannan
    Zhang, Xiuzhen
    EXPERIMENTAL IR MEETS MULTILINGUALITY, MULTIMODALITY, AND INTERACTION, PT I, CLEF 2024, 2024, 14958 : 254 - 265