Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training

Cited by: 0
Authors
Zhang, Haode [1 ]
Liang, Haowen [1 ]
Zhan, Liming
Lam, Albert Y. S. [2 ]
Wu, Xiao-Ming [1 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Fano Labs, Hong Kong, Peoples R China
DOI: not available
Abstract
We consider the task of few-shot intent detection, which involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data. The current approach to address this problem is through continual pre-training, i.e., fine-tuning pre-trained language models (PLMs) on external resources (e.g., conversational corpora, public intent detection datasets, or natural language understanding datasets) before using them as utterance encoders for training an intent classifier. In this paper, we show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected. Specifically, we find that directly fine-tuning PLMs on only a handful of labeled examples already yields decent results compared to methods that employ continual pre-training, and the performance gap diminishes rapidly as the number of labeled data increases. To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance. Comprehensive experiments on real-world benchmarks show that given only two or more labeled samples per class, direct fine-tuning outperforms many strong baselines that utilize external data sources for continual pre-training. The code can be found at https://github.com/hdzhang-code/DFTPlus.
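The abstract outlines the approach at a high level; the sketch below illustrates what "directly fine-tuning a PLM on a handful of labeled examples" typically looks like in practice. This is a minimal illustration assuming a BERT-style encoder loaded via HuggingFace Transformers and trained with PyTorch; the model name, the toy 2-shot utterances, and the hyperparameters are illustrative assumptions, not the paper's exact setup. The authors' actual implementation, including the context augmentation and sequential self-distillation components, is in the repository linked above.

```python
# Minimal sketch: direct fine-tuning of a PLM for few-shot intent
# classification. Model choice, data, and hyperparameters are
# illustrative assumptions; see https://github.com/hdzhang-code/DFTPlus
# for the authors' full method.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Toy 2-shot training set: two labeled utterances per intent class.
utterances = [
    "what's the weather like today",   # intent 0: weather
    "will it rain tomorrow",           # intent 0: weather
    "play some jazz music",            # intent 1: play_music
    "put on my workout playlist",      # intent 1: play_music
]
labels = torch.tensor([0, 0, 1, 1])

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

batch = tokenizer(utterances, padding=True, truncation=True,
                  return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(20):  # tiny training set, so many epochs are cheap
    optimizer.zero_grad()
    out = model(**batch, labels=labels)  # cross-entropy on the [CLS] head
    out.loss.backward()
    optimizer.step()

# Inference on an unseen utterance.
model.eval()
with torch.no_grad():
    test = tokenizer(["is it going to snow tonight"], return_tensors="pt")
    pred = model(**test).logits.argmax(dim=-1)
print(pred.item())  # likely 0 (weather) after fine-tuning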
Pages: 11105-11119 (15 pages)
Related papers (50 in total)
  • [1] Few-Shot Intent Detection via Contrastive Pre-Training and Fine-Tuning
    Zhang, Jian-Guo
    Bui, Trung
    Yoon, Seunghyun
    Chen, Xiang
    Liu, Zhiwei
    Xia, Congying
    Tran, Quan Hung
    Chang, Walter
    Yu, Philip
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 1906 - 1912
  • [2] Effectiveness of Pre-training for Few-shot Intent Classification
    Zhang, Haode
    Zhang, Yuwei
    Zhan, Li-Ming
    Chen, Jiaxin
    Shi, Guangyuan
    Wu, Xiao-Ming
    Lam, Albert Y. S.
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1114 - 1120
  • [3] Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization
    Zhang, Haode
    Liang, Haowen
    Zhang, Yuwei
    Zhan, Liming
    Wu, Xiao-Ming
    Lu, Xiaolei
    Lam, Albert Y. S.
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 532 - 542
  • [4] Improving Pre-Training and Fine-Tuning for Few-Shot SAR Automatic Target Recognition
    Zhang, Chao
    Dong, Hongbin
    Deng, Baosong
    REMOTE SENSING, 2023, 15 (06)
  • [5] Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification
    Sung, Mujeen
Gung, James
    Mansimov, Elman
    Pappas, Nikolaos
    Shu, Raphael
    Romeo, Salvatore
    Zhang, Yi
    Castelli, Vittorio
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 10433 - 10442
  • [6] Hybrid Fine-Tuning Strategy for Few-Shot Classification
    Zhao, Lei
    Ou, Zhonghua
    Zhang, Lixun
    Li, Shuxiao
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [7] Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation
    Mosbach, Marius
    Pimentel, Tiago
    Ravfogel, Shauli
    Klakow, Dietrich
    Elazar, Yanai
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 12284 - 12314
  • [8] Fine-Tuning for Few-Shot Image Classification by Multimodal Prototype Regularization
    Wu, Qianhao
    Qi, Jiaxin
    Zhang, Dong
    Zhang, Hanwang
    Tang, Jinhui
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 8543 - 8556
  • [9] Label Semantic Aware Pre-training for Few-shot Text Classification
    Mueller, Aaron
    Krone, Jason
    Romeo, Salvatore
    Mansour, Saab
    Mansimov, Elman
    Zhang, Yi
    Roth, Dan
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 8318 - 8334
  • [10] Pathologies of Pre-trained Language Models in Few-shot Fine-tuning
    Chen, Hanjie
    Zheng, Guoqing
    Awadallah, Ahmed Hassan
    Ji, Yangfeng
    PROCEEDINGS OF THE THIRD WORKSHOP ON INSIGHTS FROM NEGATIVE RESULTS IN NLP (INSIGHTS 2022), 2022, : 144 - 153