Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training

Cited by: 0
Authors
Zhang, Haode [1 ]
Liang, Haowen [1 ]
Zhan, Liming [1 ]
Lam, Albert Y. S. [2 ]
Wu, Xiao-Ming [1 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Fano Labs, Hong Kong, Peoples R China
DOI: Not available
Abstract
We consider the task of few-shot intent detection, which involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data. The prevailing approach to this problem is continual pre-training, i.e., fine-tuning pre-trained language models (PLMs) on external resources (e.g., conversational corpora, public intent detection datasets, or natural language understanding datasets) before using them as utterance encoders for training an intent classifier. In this paper, we show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected. Specifically, we find that directly fine-tuning PLMs on only a handful of labeled examples already yields decent results compared to methods that employ continual pre-training, and the performance gap diminishes rapidly as the number of labeled examples increases. To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance. Comprehensive experiments on real-world benchmarks show that given only two or more labeled samples per class, direct fine-tuning outperforms many strong baselines that utilize external data sources for continual pre-training. The code can be found at https://github.com/hdzhang-code/DFTPlus.
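The core recipe described in the abstract is standard supervised fine-tuning of an off-the-shelf PLM on a few labeled utterances per intent, with no continual pre-training stage. Below is a minimal sketch of that direct fine-tuning setup, not the authors' released code (see the GitHub link above); the model name, toy utterances, intent labels, and hyperparameters are illustrative assumptions.

# Minimal sketch (assumption, not the authors' released code): direct fine-tuning
# of an off-the-shelf PLM on a handful of labeled utterances, without any
# continual pre-training stage. Model name, utterances, labels, and
# hyperparameters are illustrative placeholders.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

utterances = ["book a flight to tokyo", "play some jazz music",
              "what is the weather tomorrow", "set an alarm for 7 am"]
labels = [0, 1, 2, 3]  # hypothetical intent ids; in practice ~2+ examples per class

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=4)

enc = tokenizer(utterances, padding=True, truncation=True, return_tensors="pt")
loader = DataLoader(TensorDataset(enc["input_ids"], enc["attention_mask"],
                                  torch.tensor(labels)),
                    batch_size=2, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(10):  # small fixed budget; epochs are cheap with few-shot data
    for input_ids, attention_mask, y in loader:
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()        # standard cross-entropy over intent classes
        optimizer.step()
        optimizer.zero_grad()

The paper's additional ingredients, context augmentation and sequential self-distillation (re-training a student on the soft predictions of the previously fine-tuned teacher), would sit on top of this loop and are omitted here for brevity.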
Pages: 11105 - 11119
Number of pages: 15
Related Papers (50 records)
  • [31] AlignDet: Aligning Pre-training and Fine-tuning in Object Detection. Li, Ming; Wu, Jie; Wang, Xionghui; Chen, Chen; Qin, Jie; Xiao, Xuefeng; Wang, Rui; Zheng, Min; Pan, Xin. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023: 6843 - 6853.
  • [32] Improved Fine-Tuning by Better Leveraging Pre-Training Data. Liu, Ziquan; Xu, Yi; Xu, Yuanhong; Qian, Qi; Li, Hao; Ji, Xiangyang; Chan, Antoni B.; Jin, Rong. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [33] Anchoring Fine-tuning of Sentence Transformer with Semantic Label Information for Efficient Truly Few-shot Classification. Pauli, Amalie Brogaard; Derczynski, Leon; Assent, Ira. 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023: 11254 - 11264.
  • [34] RIFF: Learning to Rephrase Inputs for Few-shot Fine-tuning of Language Models. Najafi, Saeed; Fyshe, Alona. Findings of the Association for Computational Linguistics: ACL 2024, 2024: 1447 - 1466.
  • [35] Open-Set Face Identification on Few-Shot Gallery by Fine-Tuning. Park, Hojin; Park, Jaewoo; Teoh, Andrew Beng Jin. 2022 26th International Conference on Pattern Recognition (ICPR), 2022: 1026 - 1032.
  • [36] Network Pruning and Fine-tuning for Few-shot Industrial Image Anomaly Detection. Zhang, Jie; Suganuma, Masanori; Okatani, Takayuki. 2023 IEEE 21st International Conference on Industrial Informatics (INDIN), 2023.
  • [37] COFT-AD: COntrastive Fine-Tuning for Few-Shot Anomaly Detection. Liao, Jingyi; Xu, Xun; Nguyen, Manh Cuong; Goodge, Adam; Foo, Chuan Sheng. IEEE Transactions on Image Processing, 2024, 33: 2090 - 2103.
  • [38] Incremental few-shot instance segmentation without fine-tuning on novel classes. Zhang, Luofeng; Weng, Libo; Zhang, Yuanming; Gao, Fei. Computer Vision and Image Understanding, 2025, 254.
  • [39] Incremental Few-Shot Object Detection via Simple Fine-Tuning Approach. Choi, Tae-Min; Kim, Jong-Hwan. 2023 IEEE International Conference on Robotics and Automation (ICRA 2023), 2023: 9289 - 9295.
  • [40] Exploring Few-Shot Fine-Tuning Strategies for Models of Visually Grounded Speech. Miller, Tyler; Harwath, David. Interspeech 2022, 2022: 1416 - 1420.