Active Learning Principles for In-Context Learning with Large Language Models

Citations: 0
Authors
Margatina, Katerina [1 ,2 ]
Schick, Timo [2 ]
Aletras, Nikolaos [1 ]
Dwivedi-Yu, Jane [2 ]
Affiliations
[1] Univ Sheffield, Sheffield, S Yorkshire, England
[2] Meta, FAIR, Menlo Pk, CA USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The remarkable advancements in large language models (LLMs) have significantly enhanced predictive performance in few-shot learning settings. By using only a small number of labeled examples, referred to as demonstrations, LLMs can effectively perform the task at hand through in-context learning. However, the process of selecting demonstrations to maximize performance has received limited attention in prior work. This paper addresses the problem of identifying the most informative demonstrations for few-shot learning by framing it as a pool-based Active Learning (AL) problem over a single iteration. We compare standard AL algorithms based on uncertainty, diversity, and similarity, and consistently observe that similarity-based selection outperforms all other methods, including random sampling. Our extensive experimentation involving a diverse range of GPT and OPT models across 24 classification and multiple-choice tasks, coupled with thorough analysis, unambiguously demonstrates the importance of using demonstrations that are semantically similar to the domain of the test examples. In fact, we show higher average classification performance using "similar" demonstrations with GPT-2 (124M) than random demonstrations with GPT-NeoX (20B). Notably, while diversity sampling shows promise, uncertainty sampling, despite its success in conventional supervised learning AL scenarios, performs poorly in in-context learning.
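The similarity-based selection the abstract advocates can be sketched as a single retrieval step over the candidate pool: score each unlabeled example against the test input and keep the top-k as in-context demonstrations. The paper operates on semantic representations of the inputs; the minimal sketch below substitutes a simple bag-of-words cosine similarity so it runs self-contained, and all names (`select_similar_demonstrations`, the example pool) are illustrative, not from the paper.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_similar_demonstrations(pool, test_input, k=4):
    # Rank the unlabeled pool by similarity to the test input and
    # return the top-k examples to use as in-context demonstrations.
    q = Counter(test_input.lower().split())
    ranked = sorted(pool,
                    key=lambda x: cosine(Counter(x.lower().split()), q),
                    reverse=True)
    return ranked[:k]

pool = [
    "the movie was a delightful surprise",
    "quarterly earnings beat analyst expectations",
    "the film pacing dragged in the second act",
    "central bank raises interest rates again",
]
demos = select_similar_demonstrations(
    pool, "a thrilling film with great acting", k=2)
```

In practice one would replace the bag-of-words vectors with sentence embeddings (e.g. from a pretrained encoder) and prepend the selected demonstrations, with their labels, to the LLM prompt.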
Pages: 5011-5034
Page count: 24
Related Papers
50 records total
  • [41] In-context learning of state estimators
    Busetto, R.
    Breschi, V.
    Forgione, M.
    Piga, D.
    Formentin, S.
    IFAC PAPERSONLINE, 2024, 58(15): 145-150
  • [42] Active in-context learning for cross-domain entity resolution
    Zhang, Ziheng
    Zeng, Weixin
    Tang, Jiuyang
    Huang, Hongbin
    Zhao, Xiang
    INFORMATION FUSION, 2025, 117
  • [43] Generative Calibration for In-context Learning
    Jiang, Zhongtao
    Zhang, Yuanzhe
    Liu, Cao
    Zhao, Jun
    Liu, Kang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 2312-2333
  • [44] Learning In-context Learning for Named Entity Recognition
    Chen, Jiawei
    Lu, Yaojie
    Lin, Hongyu
    Lou, Jie
    Jia, Wei
    Dai, Dai
    Wu, Hua
    Cao, Boxi
    Han, Xianpei
    Sun, Le
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023: 13661-13675
  • [45] Distinguishability Calibration to In-Context Learning
    Li, Hongjing
    Yan, Hanqi
    Li, Yanran
    Qian, Li
    He, Yulan
    Gui, Lin
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 1385-1397
  • [46] Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
    Bai, Yu
    Chen, Fan
    Wang, Huan
    Xiong, Caiming
    Mei, Song
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [47] Requirements Satisfiability with In-Context Learning
    Santos, Sarah
    Breaux, Travis
    Norton, Thomas
    Haghighi, Sara
    Ghanavati, Sepideh
    32ND IEEE INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE, RE 2024, 2024: 168-179
  • [48] Is Mamba Capable of In-Context Learning?
    Grazzi, Riccardo
    Siems, Julien
    Schrodi, Simon
    Brox, Thomas
    Hutter, Frank
    INTERNATIONAL CONFERENCE ON AUTOMATED MACHINE LEARNING, 2024, 256
  • [49] In-Context Symbolic Regression: Leveraging Large Language Models for Function Discovery
    Merler, Matteo
    Haitsiukevich, Katsiaryna
    Dainese, Nicola
    Marttinen, Pekka
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 4: STUDENT RESEARCH WORKSHOP, 2024: 445-462
  • [50] Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking
    Chen, Derek
    Qian, Kun
    Yu, Zhou
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 1551-1564