Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization

Cited by: 0
Authors
Zhang, Haode [1 ]
Liang, Haowen [1 ]
Zhang, Yuwei [2 ]
Zhan, Liming [1 ]
Wu, Xiao-Ming [1 ]
Lu, Xiaolei [3 ]
Lam, Albert Y. S. [4 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Univ Calif San Diego, La Jolla, CA 92093 USA
[3] Nanyang Technol Univ, Singapore, Singapore
[4] Fano Labs, Hong Kong, Peoples R China
Keywords
REGRESSION
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
It is challenging to train a good intent classifier for a task-oriented dialogue system with only a few annotations. Recent studies have shown that fine-tuning pre-trained language models with a small number of labeled utterances from public benchmarks in a supervised manner is extremely helpful. However, we find that supervised pre-training yields an anisotropic feature space, which may suppress the expressive power of the semantic representations. Inspired by recent research in isotropization, we propose to improve supervised pre-training by regularizing the feature space towards isotropy. We propose two regularizers, based on contrastive learning and the correlation matrix respectively, and demonstrate their effectiveness through extensive experiments. Our main finding is that it is promising to regularize supervised pre-training with isotropization to further improve the performance of few-shot intent detection.
Pages: 532-542
Page count: 11
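The abstract above mentions an isotropization regularizer built from the correlation matrix of the feature space. The following is a minimal PyTorch sketch of one plausible form of such a regularizer, which pushes the correlation matrix of a batch of utterance embeddings toward the identity; the function name, the cor_reg_weight hyper-parameter, and the simple addition to a cross-entropy term are illustrative assumptions for this sketch, not the paper's exact formulation.

# Minimal sketch (PyTorch): a correlation-matrix isotropy regularizer.
# Assumptions: features are [CLS]-style utterance embeddings, and the
# regularizer is added to a supervised cross-entropy loss with an assumed
# weight cor_reg_weight.
import torch

def correlation_regularizer(features: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """features: (batch_size, hidden_dim) utterance embeddings."""
    # Standardize each feature dimension across the batch.
    centered = features - features.mean(dim=0, keepdim=True)
    normed = centered / (centered.std(dim=0, keepdim=True) + eps)
    # Sample correlation matrix of the feature dimensions, (hidden_dim, hidden_dim).
    corr = (normed.T @ normed) / (features.size(0) - 1)
    # Penalize deviation from the identity matrix (perfect isotropy).
    identity = torch.eye(corr.size(0), device=features.device)
    return ((corr - identity) ** 2).mean()

if __name__ == "__main__":
    # Toy usage: combine the regularizer with a cross-entropy intent loss.
    batch, dim, num_classes = 16, 32, 5
    cls_embeddings = torch.randn(batch, dim, requires_grad=True)
    logits = torch.randn(batch, num_classes, requires_grad=True)
    labels = torch.randint(0, num_classes, (batch,))
    cor_reg_weight = 0.1  # assumed hyper-parameter, not taken from the record above
    loss = torch.nn.functional.cross_entropy(logits, labels) \
           + cor_reg_weight * correlation_regularizer(cls_embeddings)
    loss.backward()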
Related Papers
50 records in total
  • [21] TOKEN Is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models
    Davody, Ali
    Adelani, David Ifeoluwa
    Kleinbauer, Thomas
    Klakow, Dietrich
    TEXT, SPEECH, AND DIALOGUE (TSD 2022), 2022, 13502 : 138 - 150
  • [22] Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks
    Xi, Zhaohan
    Du, Tianyu
    Li, Changjiang
    Pang, Ren
    Ji, Shouling
    Chen, Jinghui
    Ma, Fenglong
    Wang, Ting
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [23] Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners
    Park, Keon-Hee
    Song, Kyungwoo
    Park, Gyeong-Moon
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 23881 - 23890
  • [24] Better Few-Shot Text Classification with Pre-trained Language Model
    Chen, Zheng
    Zhang, Yunchen
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT II, 2021, 12892 : 537 - 548
  • [25] CSS-LM: A Contrastive Framework for Semi-Supervised Fine-Tuning of Pre-Trained Language Models
    Su, Yusheng
    Han, Xu
    Lin, Yankai
    Zhang, Zhengyan
    Liu, Zhiyuan
    Li, Peng
    Zhou, Jie
    Sun, Maosong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2930 - 2941
  • [26] Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection
    Uppaal, Rheeya
    Hu, Junjie
    Li, Yixuan
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 12813 - 12832
  • [27] Towards Fine-tuning Pre-trained Language Models with Integer Forward and Backward Propagation
    Tayaranian, Mohammadreza
    Ghaffari, Alireza
    Tahaei, Marzieh S.
    Rezagholizadeh, Mehdi
    Asgharian, Masoud
    Nia, Vahid Partovi
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1912 - 1921
  • [28] Enhancing Machine-Generated Text Detection: Adversarial Fine-Tuning of Pre-Trained Language Models
    Lee, Dong Hee
    Jang, Beakcheol
    IEEE ACCESS, 2024, 12 : 65333 - 65340
  • [29] Efficient Fine-Tuning for Low-Resource Tibetan Pre-trained Language Models
    Zhou, Mingjun
    Daiqing, Zhuoma
    Qun, Nuo
    Nyima, Tashi
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT VII, 2024, 15022 : 410 - 422
  • [30] SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models
    Thangarasa, Vithursan
    Gupta, Abhay
    Marshall, William
    Li, Tianda
    Leong, Kevin
    DeCoste, Dennis
    Lie, Sean
    Saxena, Shreyas
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 2134 - 2146