Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training

Cited by: 0
Authors
Zhang, Haode [1 ]
Liang, Haowen [1 ]
Zhan, Liming [1 ]
Lam, Albert Y. S. [2 ]
Wu, Xiao-Ming [1 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Fano Labs, Hong Kong, Peoples R China
Keywords: None listed
DOI: Not available
Abstract
We consider the task of few-shot intent detection, which involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data. The current approach to address this problem is through continual pre-training, i.e., fine-tuning pre-trained language models (PLMs) on external resources (e.g., conversational corpora, public intent detection datasets, or natural language understanding datasets) before using them as utterance encoders for training an intent classifier. In this paper, we show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected. Specifically, we find that directly fine-tuning PLMs on only a handful of labeled examples already yields decent results compared to methods that employ continual pre-training, and the performance gap diminishes rapidly as the number of labeled data increases. To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance. Comprehensive experiments on real-world benchmarks show that given only two or more labeled samples per class, direct fine-tuning outperforms many strong baselines that utilize external data sources for continual pre-training. The code can be found at https://github.com/hdzhang-code/DFTPlus.
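The abstract outlines a two-step recipe: fine-tune a PLM directly on the handful of labeled utterances, then apply sequential self-distillation, where each new model generation learns from the soft predictions of the previous one. Below is a minimal sketch of that loop, assuming bert-base-uncased, a toy two-intent dataset, and an unweighted distillation term; the paper's actual hyperparameters, loss weighting, and context augmentation step are not reproduced here, so treat this as an illustration rather than the authors' implementation (see the repository linked above for the real code).

```python
# Sketch: direct fine-tuning of a PLM for few-shot intent classification,
# followed by sequential self-distillation. Model name, data, and
# hyperparameters are illustrative assumptions, not the paper's setup.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Toy few-shot data: two utterances per intent class.
utterances = ["book a flight to tokyo", "play some jazz music",
              "reserve a seat to paris", "put on my workout playlist"]
labels = torch.tensor([0, 1, 0, 1])  # 0 = book_flight, 1 = play_music

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tok(utterances, padding=True, truncation=True, return_tensors="pt")

def train(model, soft_targets=None, epochs=5, lr=2e-5):
    """One fine-tuning run on hard labels, optionally adding a KL term
    that pulls the student toward a teacher's soft predictions."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        opt.zero_grad()
        logits = model(**batch).logits
        loss = F.cross_entropy(logits, labels)
        if soft_targets is not None:  # distillation term
            loss = loss + F.kl_div(F.log_softmax(logits, dim=-1),
                                   soft_targets, reduction="batchmean")
        loss.backward()
        opt.step()
    return model

# Generation 0: plain direct fine-tuning on the few labeled examples.
teacher = train(AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2))

# Sequential self-distillation: each generation starts from the pre-trained
# checkpoint and learns from the previous generation's soft labels.
for _ in range(2):
    teacher.eval()
    with torch.no_grad():
        soft = F.softmax(teacher(**batch).logits, dim=-1)
    student = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    teacher = train(student, soft_targets=soft)
```

The design point the sketch illustrates is that no external corpus or continual pre-training stage is involved: every generation is trained only on the original few labeled examples, with the teacher's soft distribution acting as extra supervision.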
Pages: 11105 - 11119
Number of pages: 15
Related Papers (showing 41-50 of 50)
  • [41] Fine-Tuning of CLIP in Few-Shot Scenarios via Supervised Contrastive Learning
    Luo, Jing
    Wu, Guangxing
    Liu, Hongmei
    Wang, Ruixuan
    PATTERN RECOGNITION AND COMPUTER VISION, PT III, PRCV 2024, 2025, 15033 : 104 - 117
  • [42] Omni-Training: Bridging Pre-Training and Meta-Training for Few-Shot Learning
    Shu, Yang
    Cao, Zhangjie
    Gao, Jinghan
    Wang, Jianmin
    Yu, Philip S.
    Long, Mingsheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15275 - 15291
  • [43] Tri-Train: Automatic Pre-Fine Tuning between Pre-Training and Fine-Tuning for SciNER
    Zeng, Qingkai
    Yu, Wenhao
    Yu, Mengxia
    Jiang, Tianwen
    Weninger, Tim
    Jiang, Meng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4778 - 4787
  • [44] Chain of Thought Guided Few-Shot Fine-Tuning of LLMs for Multimodal Aspect-Based Sentiment Classification
    Wu, Hao
    Yang, Danping
    Liu, Peng
    Li, Xianxian
    MULTIMEDIA MODELING, MMM 2025, PT I, 2025, 15520 : 182 - 194
  • [45] Bridging the Gap between Pre-Training and Fine-Tuning for Commonsense Generation
    Yang, Haoran
    Wang, Yan
    Li, Piji
    Bi, Wei
    Lam, Wai
    Xu, Chen
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 376 - 383
  • [46] On the Connection between Pre-training Data Diversity and Fine-tuning Robustness
    Ramanujan, Vivek
    Nguyen, Thao
    Oh, Sewoong
    Schmidt, Ludwig
    Farhadi, Ali
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [47] Partial Is Better Than All: Revisiting Fine-tuning Strategy for Few-shot Learning
    Shen, Zhiqiang
    Liu, Zechun
    Qin, Jie
    Savvides, Marios
    Cheng, Kwang-Ting
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9594 - 9602
  • [48] Data race detection via few-shot parameter-efficient fine-tuning
    Shen, Yuanyuan
    Peng, Manman
    Zhang, Fan
    Wu, Qiang
    JOURNAL OF SYSTEMS AND SOFTWARE, 2025, 222
  • [49] A fine-tuning prototypical network for few-shot cross-domain fault diagnosis
    Zhong, Jianhua
    Gu, Kairong
    Jiang, Haifeng
    Liang, Wei
    Zhong, Shuncong
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, 35 (11)
  • [50] Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation
    Fabbri, Alexander R.
    Han, Simeng
    Li, Haoyuan
    Li, Haoran
    Ghazvininejad, Marjan
    Joty, Shafiq
    Radev, Dragomir
    Mehdad, Yashar
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 704 - 717