Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization

Cited: 0
Authors
Zhang, Haode [1 ]
Liang, Haowen [1 ]
Zhang, Yuwei [2 ]
Zhan, Liming [1 ]
Wu, Xiao-Ming [1 ]
Lu, Xiaolei [3 ]
Lam, Albert Y. S. [4 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Univ Calif San Diego, La Jolla, CA 92093 USA
[3] Nanyang Technol Univ, Singapore, Singapore
[4] Fano Labs, Hong Kong, Peoples R China
Keywords
REGRESSION;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
It is challenging to train a good intent classifier for a task-oriented dialogue system with only a few annotations. Recent studies have shown that fine-tuning pre-trained language models with a small number of labeled utterances from public benchmarks in a supervised manner is extremely helpful. However, we find that supervised pre-training yields an anisotropic feature space, which may suppress the expressive power of the semantic representations. Inspired by recent research on isotropization, we propose to improve supervised pre-training by regularizing the feature space towards isotropy. We propose two regularizers, based on contrastive learning and the feature correlation matrix respectively, and demonstrate their effectiveness through extensive experiments. Our main finding is that regularizing supervised pre-training with isotropization is a promising way to further improve the performance of few-shot intent detection.
Pages: 532 - 542
Number of pages: 11
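
To make the idea of isotropization concrete, below is a minimal PyTorch sketch of how the two regularizers mentioned in the abstract (a correlation-matrix term and a supervised contrastive term) could be combined with a supervised pre-training cross-entropy objective. The function names, the weighting factor lam, and the exact formulations are illustrative assumptions for this sketch, not the paper's published method; features are assumed to be utterance embeddings from the pre-trained encoder and logits the outputs of an intent classification head.

```python
# Sketch (assumed formulation): supervised pre-training loss with an
# isotropization regularizer added on top of cross-entropy.
import torch
import torch.nn.functional as F


def correlation_regularizer(features: torch.Tensor) -> torch.Tensor:
    """Push the correlation matrix of utterance embeddings toward identity,
    i.e. decorrelate feature dimensions to encourage an isotropic space."""
    # features: (batch_size, hidden_dim), e.g. [CLS] vectors from the encoder
    z = features - features.mean(dim=0, keepdim=True)       # zero mean per dimension
    z = z / (z.std(dim=0, keepdim=True) + 1e-8)             # unit variance per dimension
    corr = (z.t() @ z) / z.size(0)                          # (hidden_dim, hidden_dim)
    identity = torch.eye(corr.size(0), device=corr.device)
    return ((corr - identity) ** 2).mean()                  # distance to the identity matrix


def supervised_contrastive_regularizer(features: torch.Tensor,
                                        labels: torch.Tensor,
                                        temperature: float = 0.1) -> torch.Tensor:
    """Supervised contrastive term: pull together utterances that share an
    intent label and push apart the rest, which also spreads out the space."""
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / temperature                            # pairwise cosine similarities
    self_mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    exp_sim = torch.exp(sim).masked_fill(self_mask, 0.0)     # exclude self-pairs
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-8)
    pos_count = pos_mask.sum(dim=1).clamp(min=1)
    return -((log_prob * pos_mask.float()).sum(dim=1) / pos_count).mean()


def pretraining_loss(logits, features, labels, lam=0.1, use_contrastive=False):
    """Cross-entropy intent classification loss plus one isotropy regularizer."""
    ce = F.cross_entropy(logits, labels)
    reg = (supervised_contrastive_regularizer(features, labels)
           if use_contrastive else correlation_regularizer(features))
    return ce + lam * reg
```

In this sketch the regularizer is simply added to the classification loss with a weight lam; how the two regularizers are weighted or combined in the actual paper may differ.
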
Related Papers
50 in total
  • [11] Few-Shot NLG with Pre-Trained Language Model
    Chen, Zhiyu
    Eavani, Harini
    Chen, Wenhu
    Liu, Yinyin
    Wang, William Yang
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 183 - 190
  • [12] PPT: Pre-trained Prompt Tuning for Few-shot Learning
    Gu, Yuxian
    Han, Xu
    Liu, Zhiyuan
    Huang, Minlie
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 8410 - 8423
  • [13] Debiasing Pre-Trained Language Models via Efficient Fine-Tuning
    Gira, Michael
    Zhang, Ruisu
    Lee, Kangwook
    PROCEEDINGS OF THE SECOND WORKSHOP ON LANGUAGE TECHNOLOGY FOR EQUALITY, DIVERSITY AND INCLUSION (LTEDI 2022), 2022, : 59 - 69
  • [14] Disfluencies and Fine-Tuning Pre-trained Language Models for Detection of Alzheimer's Disease
    Yuan, Jiahong
    Bian, Yuchen
    Cai, Xingyu
    Huang, Jiaji
    Ye, Zheng
    Church, Kenneth
    INTERSPEECH 2020, 2020, : 2162 - 2166
  • [15] Pruning Pre-trained Language Models Without Fine-Tuning
    Jiang, Ting
    Wang, Deqing
    Zhuang, Fuzhen
    Xie, Ruobing
    Xia, Feng
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 594 - 605
  • [16] SFMD: A Semi-supervised Framework for Pre-trained Language Models Fine-Tuning with Noisy Samples
    Yang, Yiwen
    Duan, Pengfei
    Li, Yongbing
    Zhang, Yifang
    Xiong, Shengwu
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14877 : 316 - 328
  • [17] Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models
    Ghanbarzadeh, Somayeh
    Huang, Yan
    Palangi, Hamid
    Moreno, Radames Cruz
    Khanpour, Hamed
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5448 - 5458
  • [18] Revisiting k-NN for Fine-Tuning Pre-trained Language Models
    Li, Lei
    Chen, Jing
    Tian, Botzhong
    Zhang, Ningyu
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2023, 2023, 14232 : 327 - 338
  • [19] Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively
    Zhang, Haojie
    Li, Ge
    Li, Jia
    Zhang, Zhongjin
    Zhu, Yuqi
    Jin, Zhi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [20] An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models
    Liu, Xueqing
    Wang, Chi
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 2286 - 2300