Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training

Cited by: 0
Authors
Zhang, Haode [1 ]
Liang, Haowen [1 ]
Zhan, Liming [1 ]
Lam, Albert Y. S. [2 ]
Wu, Xiao-Ming [1 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Fano Labs, Hong Kong, Peoples R China
DOI: not available
Abstract
We consider the task of few-shot intent detection, which involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data. The prevailing approach to this problem is continual pre-training, i.e., fine-tuning pre-trained language models (PLMs) on external resources (e.g., conversational corpora, public intent detection datasets, or natural language understanding datasets) before using them as utterance encoders for training an intent classifier. In this paper, we show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected. Specifically, we find that directly fine-tuning PLMs on only a handful of labeled examples already yields decent results compared to methods that employ continual pre-training, and the performance gap diminishes rapidly as the amount of labeled data increases. To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance. Comprehensive experiments on real-world benchmarks show that given only two or more labeled samples per class, direct fine-tuning outperforms many strong baselines that utilize external data sources for continual pre-training. The code can be found at https://github.com/hdzhang-code/DFTPlus.
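
The direct fine-tuning baseline discussed in the abstract can be illustrated with a minimal sketch using the HuggingFace transformers library. This is not the authors' released code (which is at the GitHub link above); the model name bert-base-uncased, the toy utterances, the intent labels, and the hyperparameters are illustrative assumptions only.

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Toy few-shot training set: a couple of labeled utterances per intent (illustrative only).
utterances = [
    "book a table for two tonight",
    "reserve a spot at the italian place",
    "what's the weather tomorrow",
    "will it rain this weekend",
]
labels = [0, 0, 1, 1]  # hypothetical intents: 0 = restaurant_reservation, 1 = weather_query

# Off-the-shelf PLM with a randomly initialized classification head -- no continual pre-training.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

enc = tokenizer(utterances, padding=True, truncation=True, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=4, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(5):  # a handful of epochs; the few-shot training set is tiny
    for input_ids, attention_mask, y in loader:
        optimizer.zero_grad()
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()  # cross-entropy loss computed internally by the model
        optimizer.step()

The context augmentation and sequential self-distillation steps mentioned in the abstract are not reproduced in this sketch; for the authors' full implementation, see https://github.com/hdzhang-code/DFTPlus.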
Pages: 11105-11119
Number of pages: 15
Related Papers
50 records in total
  • [21] Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning
    Basu, Samyadeep
    Hu, Shell
    Massiceti, Daniela
    Feizi, Soheil
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10, 2024, : 11024 - 11031
  • [22] AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration
    Li, Bowen
    Wang, Chen
    Reddy, Pranay
    Kim, Seungchan
    Scherer, Sebastian
arXiv, 2021
  • [23] Enhancing Few-Shot CLIP With Semantic-Aware Fine-Tuning
    Zhu, Yao
    Chen, Yuefeng
    Mao, Xiaofeng
    Yan, Xiu
    Wang, Yue
    Lu, Wang
    Wang, Jindong
    Ji, Xiangyang
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024
  • [24] Few-Shot Fine-Tuning SOTA Summarization Models for Medical Dialogues
    Navarro, David Fraile
    Dras, Mark
    Berkovsky, Shlomo
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2022, : 254 - 266
  • [25] Few-shot fine-tuning with auxiliary tasks for video anomaly detection
    Lv, Jing
    Liu, Zhi
    Li, Gongyang
    MULTIMEDIA SYSTEMS, 2025, 31 (02)
  • [26] CODE: Contrastive Pre-training with Adversarial Fine-Tuning for Zero-Shot Expert Linking
    Chen, Bo
    Zhang, Jing
    Zhang, Xiaokang
    Tang, Xiaobin
    Cai, Lingfan
    Chen, Hong
    Li, Cuiping
    Zhang, Peng
    Tang, Jie
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11846 - 11854
  • [27] The Devil is in the Details: On Models and Training Regimes for Few-Shot Intent Classification
    Mesgar, Mohsen
    Thy Thy Tran
    Glavas, Goran
    Gurevych, Iryna
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1846 - 1857
  • [28] Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning
    Sun, Tianxiang
    He, Zhengfu
    Zhu, Qin
    Qiu, Xipeng
    Huang, Xuanjing
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 11156 - 11172
  • [29] SAR-HUB: Pre-Training, Fine-Tuning, and Explaining
    Yang, Haodong
    Kang, Xinyue
    Liu, Long
    Liu, Yujiang
    Huang, Zhongling
    REMOTE SENSING, 2023, 15 (23)
  • [30] Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction
    Luo, Da
    Gan, Yanglei
    Hou, Rui
    Lin, Run
    Liu, Qiao
    Cai, Yuxiang
    Gao, Wannian
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 18742 - 18750