Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training

Cited by: 0
Authors
Zhang, Haode [1 ]
Liang, Haowen [1 ]
Zhan, Liming [1 ]
Lam, Albert Y. S. [2 ]
Wu, Xiao-Ming [1 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Fano Labs, Hong Kong, Peoples R China
DOI: not available
Abstract
We consider the task of few-shot intent detection, which involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data. The prevailing approach to this problem is continual pre-training, i.e., fine-tuning pre-trained language models (PLMs) on external resources (e.g., conversational corpora, public intent detection datasets, or natural language understanding datasets) before using them as utterance encoders for training an intent classifier. In this paper, we show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected. Specifically, we find that directly fine-tuning PLMs on only a handful of labeled examples already yields decent results compared to methods that employ continual pre-training, and the performance gap diminishes rapidly as the amount of labeled data increases. To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance. Comprehensive experiments on real-world benchmarks show that given only two or more labeled samples per class, direct fine-tuning outperforms many strong baselines that utilize external data sources for continual pre-training. The code can be found at https://github.com/hdzhang-code/DFTPlus.
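
The direct fine-tuning baseline discussed in the abstract can be illustrated with a minimal sketch using the HuggingFace transformers library. This is not the authors' released code (which is at the GitHub link above); the model name bert-base-uncased, the toy utterances, the intent labels, and the hyperparameters are illustrative assumptions only.

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Toy few-shot training set: a couple of labeled utterances per intent (illustrative only).
utterances = [
    "book a table for two tonight",
    "reserve a spot at the italian place",
    "what's the weather tomorrow",
    "will it rain this weekend",
]
labels = [0, 0, 1, 1]  # hypothetical intents: 0 = restaurant_reservation, 1 = weather_query

# Off-the-shelf PLM with a randomly initialized classification head -- no continual pre-training.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

enc = tokenizer(utterances, padding=True, truncation=True, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=4, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(5):  # a handful of epochs; the few-shot training set is tiny
    for input_ids, attention_mask, y in loader:
        optimizer.zero_grad()
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()  # cross-entropy loss computed internally by the model
        optimizer.step()

The context augmentation and sequential self-distillation steps mentioned in the abstract are not reproduced in this sketch; for the authors' full implementation, see https://github.com/hdzhang-code/DFTPlus.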
Pages: 11105-11119
Number of pages: 15
Related Papers
50 records in total
  • [21] Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning
    Basu, Samyadeep
    Hu, Shell
    Massiceti, Daniela
    Feizi, Soheil
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10, 2024, : 11024 - 11031
  • [22] AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration
    Li, Bowen
    Wang, Chen
    Reddy, Pranay
    Kim, Seungchan
    Scherer, Sebastian
arXiv, 2021
  • [23] Enhancing Few-Shot CLIP With Semantic-Aware Fine-Tuning
    Zhu, Yao
    Chen, Yuefeng
    Mao, Xiaofeng
    Yan, Xiu
    Wang, Yue
    Lu, Wang
    Wang, Jindong
    Ji, Xiangyang
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024
  • [24] Few-Shot Fine-Tuning SOTA Summarization Models for Medical Dialogues
    Navarro, David Fraile
    Dras, Mark
    Berkovsky, Shlomo
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2022, : 254 - 266
  • [25] Few-shot fine-tuning with auxiliary tasks for video anomaly detection
    Lv, Jing
    Liu, Zhi
    Li, Gongyang
    MULTIMEDIA SYSTEMS, 2025, 31 (02)
  • [26] CODE: Contrastive Pre-training with Adversarial Fine-Tuning for Zero-Shot Expert Linking
    Chen, Bo
    Zhang, Jing
    Zhang, Xiaokang
    Tang, Xiaobin
    Cai, Lingfan
    Chen, Hong
    Li, Cuiping
    Zhang, Peng
    Tang, Jie
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11846 - 11854
  • [27] The Devil is in the Details: On Models and Training Regimes for Few-Shot Intent Classification
    Mesgar, Mohsen
    Thy Thy Tran
    Glavas, Goran
    Gurevych, Iryna
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1846 - 1857
  • [28] Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning
    Sun, Tianxiang
    He, Zhengfu
    Zhu, Qin
    Qiu, Xipeng
    Huang, Xuanjing
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 11156 - 11172
  • [29] SAR-HUB: Pre-Training, Fine-Tuning, and Explaining
    Yang, Haodong
    Kang, Xinyue
    Liu, Long
    Liu, Yujiang
    Huang, Zhongling
    REMOTE SENSING, 2023, 15 (23)
  • [30] Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction
    Luo, Da
    Gan, Yanglei
    Hou, Rui
    Lin, Run
    Liu, Qiao
    Cai, Yuxiang
    Gao, Wannian
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 18742 - 18750