Distinguishability Calibration to In-Context Learning

Cited by: 0
Authors
Li, Hongjing [1 ]
Yan, Hanqi [1 ]
Li, Yanran
Qian, Li [2 ]
He, Yulan [1 ,3 ,4 ]
Gui, Lin [3 ]
Affiliations
[1] Univ Warwick, Dept Comp Sci, Coventry, W Midlands, England
[2] Xiaomi AI Lab, Beijing, Peoples R China
[3] Kings Coll London, Dept Informat, London, England
[4] Alan Turing Inst, London, England
Funding
UK Engineering and Physical Sciences Research Council; US National Science Foundation; UK Research and Innovation
DOI: not available
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recent years have witnessed increasing interest in prompt-based learning, in which models can be trained on only a few annotated instances, making them suitable for low-resource settings. When prompt-based learning is used for text classification, the goal is to use a pre-trained language model (PLM) to predict a missing token in a pre-defined template given an input text, where the predicted token can then be mapped to a class label. However, PLMs built on the transformer architecture tend to generate similar output embeddings, making it difficult to discriminate between different class labels. The problem is further exacerbated in classification tasks involving many fine-grained class labels. In this work, we alleviate this information diffusion issue, i.e., that different tokens share a large proportion of similar information after passing through a transformer's stacked self-attention layers, by proposing a calibration method built on feature transformations through rotation and scaling, which maps a PLM-encoded embedding into a new metric space that guarantees the distinguishability of the resulting embeddings. Furthermore, we take advantage of hyperbolic embeddings to capture the hierarchical relations among fine-grained class-associated token embeddings via a coarse-to-fine metric learning strategy, further enhancing the distinguishability of the learned output embeddings. Extensive experiments on three datasets under various settings demonstrate the effectiveness of our approach.
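For intuition, below is a minimal PyTorch sketch of the two ideas the abstract names: a learnable rotation-and-scaling map applied to PLM embeddings, and a Poincaré-ball distance of the kind used for hyperbolic, coarse-to-fine metric learning. This is an illustration of the general technique only; the names (RotationScalingCalibration, poincare_distance), the QR-based orthogonalisation, and the unit-ball projection are assumptions for the sketch, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class RotationScalingCalibration(nn.Module):
    # Illustrative sketch (not the paper's code): rotate a PLM-encoded
    # embedding with an orthogonal matrix, then rescale each dimension,
    # mapping it into a new metric space where class-associated
    # embeddings are easier to tell apart.
    def __init__(self, dim: int):
        super().__init__()
        # Unconstrained matrix, orthogonalised via QR at forward time,
        # so that x @ q acts as a pure rotation/reflection.
        self.raw_rotation = nn.Parameter(torch.eye(dim))
        # Log-parameterised per-dimension scale keeps scaling positive.
        self.log_scale = nn.Parameter(torch.zeros(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, _ = torch.linalg.qr(self.raw_rotation)
        return (x @ q) * self.log_scale.exp()

def poincare_distance(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Distance in the Poincare ball, the standard hyperbolic model for
    # hierarchical (coarse-to-fine) relations; inputs must lie strictly
    # inside the unit ball.
    sq_dist = (u - v).pow(2).sum(dim=-1)
    denom = (1 - u.pow(2).sum(dim=-1)) * (1 - v.pow(2).sum(dim=-1))
    return torch.acosh(1 + 2 * sq_dist / denom)

# Toy usage: calibrate a batch of 768-d embeddings, then compare two of
# them after shrinking the batch into the open unit ball.
calib = RotationScalingCalibration(dim=768)
z = calib(torch.randn(4, 768))                       # stand-in PLM outputs
ball = 0.9 * z / (1 + z.norm(dim=-1, keepdim=True))  # norm < 0.9 < 1
print(poincare_distance(ball[0], ball[1]))
```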
Pages: 1385-1397
Page count: 13
Related papers
50 entries in total
  • [21] Mitigating Label Biases for In-context Learning
    Fei, Yu
    Hou, Yifan
    Chen, Zeming
    Bosselut, Antoine
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2023: 14014-14031
  • [22] Self-Adaptive In-Context Learning: An Information Compression Perspective for In-Context Example Selection and Ordering
    Wu, Zhiyong
    Wang, Yaoxiang
    Ye, Jiacheng
    Kong, Lingpeng
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2023: 1423-1436
  • [23] Robustness of Named Entity Replacements for In-Context Learning
    Goodarzi, Saeed
    Kagita, Nikhil
    Minn, Dennis
    Wang, Shufan
    Dessi, Roberto
    Toshniwal, Shubham
    Williams, Adina
    Lanchantin, Jack
    Sinha, Koustuv
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: EMNLP 2023, 2023: 10914-10931
  • [24] Prompt Optimization via Adversarial In-Context Learning
    Do, Xuan Long
    Zhao, Yiran
    Brown, Hannah
    Xie, Yuxi
    Zhao, James Xu
    Chen, Nancy F.
    Kawaguchi, Kenji
    Shieh, Michael
    He, Junxian
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024: 7308-7327
  • [25] The Transient Nature of Emergent In-Context Learning in Transformers
    Singh, Aaditya K.
    Chan, Stephanie C. Y.
    Moskovitz, Ted
    Grant, Erin
    Saxe, Andrew M.
    Hill, Felix
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [26] PRODIGY: Enabling In-context Learning Over Graphs
    Huang, Qian
    Ren, Hongyu
    Chen, Peng
    Krzmanc, Gregor
    Zeng, Daniel
    Liang, Percy
    Leskovec, Jure
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [27] Using In-Context Learning to Improve Dialogue Safety
    Meade, Nicholas
    Gella, Spandana
    Hazarika, Devamanyu
    Gupta, Prakhar
    Jin, Di
    Reddy, Siva
    Liu, Yang
    Hakkani-Tur, Dilek
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: EMNLP 2023, 2023: 11882-11910
  • [28] On the Relation between Sensitivity and Accuracy in In-Context Learning
    Chen, Yanda
    Zhao, Chen
    Yu, Zhou
    McKeown, Kathleen
    He, He
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: EMNLP 2023, 2023: 155-167
  • [29] Active Learning Principles for In-Context Learning with Large Language Models
    Margatina, Katerina
    Schick, Timo
    Aletras, Nikolaos
    Dwivedi-Yu, Jane
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: EMNLP 2023, 2023: 5011-5034
  • [30] Schema-learning and rebinding as mechanisms of in-context learning and emergence
    Swaminathan, Sivaramakrishnan
    Dedieu, Antoine
    Raju, Rajkumar Vasudeva
    Shanahan, Murray
    Lazaro-Gredilla, Miguel
    George, Dileep
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023