Distinguishability Calibration to In-Context Learning

Cited: 0
Authors
Li, Hongjing [1]
Yan, Hanqi [1]
Li, Yanran
Qian, Li [2]
He, Yulan [1,3,4]
Gui, Lin [3]
Affiliations
[1] Univ Warwick, Dept Comp Sci, Coventry, W Midlands, England
[2] Xiaomi AI Lab, Beijing, Peoples R China
[3] Kings Coll London, Dept Informat, London, England
[4] Alan Turing Inst, London, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); US National Science Foundation (NSF); UK Research and Innovation (UKRI);
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Recent years have witnessed increasing interest in prompt-based learning, in which models can be trained on only a few annotated instances, making them suitable for low-resource settings. When prompt-based learning is used for text classification, the goal is to use a pre-trained language model (PLM) to predict a missing token in a pre-defined template given an input text; the predicted token can then be mapped to a class label. However, PLMs built on the transformer architecture tend to generate similar output embeddings, making it difficult to discriminate between different class labels. The problem is further exacerbated in classification tasks involving many fine-grained class labels. In this work, we alleviate this information diffusion issue, i.e., the tendency of different tokens to share a large proportion of similar information after passing through stacked self-attention layers in a transformer, by proposing a calibration method built on feature transformations through rotation and scaling, which maps a PLM-encoded embedding into a new metric space so as to guarantee the distinguishability of the resulting embeddings. Furthermore, we take advantage of hyperbolic embeddings to capture the hierarchical relations among fine-grained class-associated token embeddings through a coarse-to-fine metric learning strategy, further enhancing the distinguishability of the learned output embeddings. Extensive experiments on three datasets under various settings demonstrate the effectiveness of our approach.
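To make the two ideas in the abstract concrete, below is a minimal PyTorch sketch pairing a learned rotation-plus-scaling calibration with a Poincaré-ball distance. The names (RotScaleCalibrator, poincare_distance), the skew-symmetric rotation parametrization via torch.matrix_exp, and the projection into the unit ball are illustrative assumptions, not the authors' implementation.

```python
# A rough sketch of the two components described in the abstract; all names
# and parametrization choices here are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RotScaleCalibrator(nn.Module):
    """Calibrates PLM output embeddings with a learned rotation followed by
    per-dimension scaling, mapping them into a new metric space where
    class-associated embeddings are easier to tell apart."""

    def __init__(self, dim: int):
        super().__init__()
        # matrix_exp of a skew-symmetric matrix is orthogonal with
        # determinant 1, i.e. a pure rotation.
        self.skew = nn.Parameter(torch.zeros(dim, dim))
        self.log_scale = nn.Parameter(torch.zeros(dim))  # exp() keeps scales positive

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        rotation = torch.matrix_exp(self.skew - self.skew.T)
        return (h @ rotation) * self.log_scale.exp()


def poincare_distance(u: torch.Tensor, v: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Distance in the Poincare ball, a standard hyperbolic model suited to
    encoding coarse-to-fine (hierarchical) relations among class embeddings."""
    sq_dist = ((u - v) ** 2).sum(dim=-1)
    u_gap = (1.0 - (u * u).sum(dim=-1)).clamp_min(eps)
    v_gap = (1.0 - (v * v).sum(dim=-1)).clamp_min(eps)
    return torch.acosh(1.0 + 2.0 * sq_dist / (u_gap * v_gap))


# Usage on toy inputs: calibrate [MASK]-token embeddings, then compare them
# with class-label token embeddings under the hyperbolic metric.
calib = RotScaleCalibrator(dim=768)
h = torch.randn(4, 768)                      # PLM-encoded [MASK] embeddings
labels = torch.randn(4, 768)                 # class-label token embeddings
z = F.normalize(calib(h), dim=-1) * 0.7      # project into the open unit ball
c = F.normalize(calib(labels), dim=-1) * 0.7
print(poincare_distance(z, c))               # per-example hyperbolic distances
```

The skew-symmetric parametrization is one way to keep the learned transform a rotation throughout training without explicit orthogonality constraints; the paper may achieve the same property differently.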
Pages: 1385-1397 (13 pages)
Related Papers
50 in total (10 shown below)
  • [1] Generative Calibration for In-context Learning
    Jiang, Zhongtao
    Zhang, Yuanzhe
    Liu, Cao
    Zhao, Jun
    Liu, Kang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 2312 - 2333
  • [2] Enhancing In-context Learning via Linear Probe Calibration
    Abbas, Momin
    Zhou, Yi
    Ram, Parikshit
    Baracaldo, Nathalie
    Samulowitz, Horst
    Salonidis, Theodoros
    Chen, Tianyi
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [3] In-Context In-Context Learning with Transformer Neural Processes
    Ashman, Matthew
    Diaconu, Cristiana
    Weller, Adrian
    Turner, Richard E.
    SYMPOSIUM ON ADVANCES IN APPROXIMATE BAYESIAN INFERENCE, 2024, 253 : 1 - 29
  • [4] A glance at in-context learning
    Wu, Yongliang
    Yang, Xu
    FRONTIERS OF COMPUTER SCIENCE, 2024, 18 (05)
  • [5] The Learnability of In-Context Learning
    Wies, Noam
    Levine, Yoav
    Shashua, Amnon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
    Bai, Yu
    Chen, Fan
    Wang, Huan
    Xiong, Caiming
    Mei, Song
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [7] What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning
    Pan, Jane
    Gao, Tianyu
    Chen, Howard
    Chen, Danqi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 8298 - 8319
  • [8] Learning To Retrieve Prompts for In-Context Learning
    Rubin, Ohad
    Herzig, Jonathan
    Berant, Jonathan
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 2655 - 2671
  • [9] In-context learning of state estimators
    Busetto, R.
    Breschi, V.
    Forgione, M.
    Piga, D.
    Formentin, S.
IFAC PAPERSONLINE, 2024, 58 (15): 145 - 150
  • [10] Requirements Satisfiability with In-Context Learning
    Santos, Sarah
    Breaux, Travis
    Norton, Thomas
    Haghighi, Sara
    Ghanavati, Sepideh
    32ND IEEE INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE, RE 2024, 2024, : 168 - 179