Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems

Citations: 0
Authors
Mi, Fei [1]
Zhou, Wanhao [2]
Cai, Fengyu [2]
Kong, Lingjing [2]
Huang, Minlie [3]
Faltings, Boi [2]
Affiliations
[1] Huawei Noah's Ark Lab, Hong Kong, China
[2] École Polytechnique Fédérale de Lausanne (EPFL), LIA, Lausanne, Switzerland
[3] Tsinghua University, CoAI, DCST, Beijing, China
Keywords: N/A
DOI: N/A
CLC Number: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
As labeling data for the different modules of task-oriented dialog (ToD) systems is expensive, a major challenge is to train these modules with the least amount of labeled data. Recently, large-scale pre-trained language models have shown promising results for few-shot learning in ToD. In this paper, we devise a self-training approach that exploits abundant unlabeled dialog data to further improve state-of-the-art pre-trained models in few-shot learning scenarios for ToD systems. The approach iteratively labels the most confident unlabeled examples and uses them to train a stronger Student model. Moreover, we propose a new text augmentation technique (GradAug) that trains the Student more effectively by replacing non-crucial tokens using a masked language model. We conduct extensive experiments and present analyses on four downstream ToD tasks: intent classification, dialog state tracking, dialog act prediction, and response selection. Empirical results demonstrate that the proposed self-training approach consistently improves state-of-the-art pre-trained models (BERT, ToD-BERT) when only a small amount of labeled data is available.
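To make the procedure described above concrete, here is a minimal Python sketch (using PyTorch and Hugging Face Transformers) of the two ideas in the abstract: confidence-based pseudo-labeling for iterative self-training, and a GradAug-style augmentation that replaces non-crucial tokens via a masked language model. The 0.9 confidence threshold, the gradient-norm saliency proxy for "non-crucial" tokens, the number of rounds, and the `fine_tune` helper are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of iterative self-training with GradAug-style augmentation.
# Illustrative assumptions (not the authors' code): the 0.9 confidence
# threshold, gradient-norm token saliency, and the omitted fine_tune helper.
import torch
import torch.nn.functional as F
from transformers import (AutoModelForMaskedLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
# Teacher: a pre-trained encoder (BERT here; the paper also uses ToD-BERT),
# assumed to be fine-tuned on the small labeled set before self-training.
teacher = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=7)  # 7 intent classes: illustrative

def select_confident(student, texts, threshold=0.9):
    """Pseudo-label unlabeled utterances, keeping only confident ones."""
    student.eval()
    keep = []
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs = torch.softmax(student(**inputs).logits, dim=-1)
        conf, label = probs.max(dim=-1)
        if conf.item() >= threshold:
            keep.append((text, label.item()))
    return keep

def token_saliency(student, input_ids, label):
    """Per-token importance as the gradient norm of the loss w.r.t. the
    input embeddings (an assumed proxy for how 'crucial' a token is)."""
    student.zero_grad()
    emb = student.get_input_embeddings()(input_ids)
    emb.retain_grad()
    loss = F.cross_entropy(student(inputs_embeds=emb).logits,
                           torch.tensor([label]))
    loss.backward()
    return emb.grad.norm(dim=-1)[0]            # shape: (seq_len,)

def grad_aug(student, text, label, replace_frac=0.15):
    """GradAug-style augmentation: mask the least salient (non-crucial)
    tokens and let a masked language model fill them in."""
    input_ids = tokenizer(text, return_tensors="pt")["input_ids"]
    sal = token_saliency(student, input_ids, label)
    sal[0] = sal[-1] = float("inf")            # never replace [CLS]/[SEP]
    k = max(1, int(replace_frac * (sal.numel() - 2)))
    low = sal.topk(k, largest=False).indices   # k least important positions
    ids = input_ids[0].clone()
    ids[low] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = mlm(input_ids=ids.unsqueeze(0)).logits[0]
    ids[low] = logits[low].argmax(dim=-1)
    return tokenizer.decode(ids, skip_special_tokens=True)

def self_train(student, labeled, unlabeled, rounds=3):
    """Each round grows the training set with confident pseudo-labels and
    their GradAug variants, then retrains a stronger Student."""
    for _ in range(rounds):
        pseudo = select_confident(student, unlabeled)
        augmented = [(grad_aug(student, t, y), y) for t, y in pseudo]
        # fine_tune: standard supervised fine-tuning loop, omitted here.
        student = fine_tune(student, labeled + pseudo + augmented)
    return student
```

A typical invocation would be `student = self_train(teacher, labeled_pairs, unlabeled_texts)`: the Teacher (BERT or ToD-BERT fine-tuned on the few labeled examples) bootstraps pseudo-labels, and each round distills confident pseudo-labels plus their augmented variants into a stronger Student.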
Pages: 1887-1898 (12 pages)
Related Papers
50 in total; items [21]-[30] shown below
  • [21] A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NER
    Dong, Guanting; Wang, Zechen; Zhao, Jinxu; Zhao, Gang; Guo, Daichi; Fu, Dayuan; Hui, Tingfeng; Zeng, Chen; He, Keqing; Li, Xuefeng; Wang, Liwen; Cui, Xinyue; Xu, Weiran
    Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM 2023), 2023: 430-440
  • [22] Modeling of Few-Shot Relation Extraction Based on Adaptive Self-Training
    Chen, H.; Zheng, J.; Cai, F.; Han, Y.
    Jisuanji Yanjiu yu Fazhan / Computer Research and Development, 2023, 60(07): 1581-1591
  • [23] Self-Training Based Few-Shot Node Classification by Knowledge Distillation
    Wu, Zongqian; Mo, Yujie; Zhou, Peng; Yuan, Shangbo; Zhu, Xiaofeng
    Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI 2024), Vol. 38, No. 14, 2024: 15988-15995
  • [24] Uncertainty-aware Self-training for Few-shot Text Classification
    Mukherjee, Subhabrata; Awadallah, Ahmed Hassan
    Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020
  • [25] Modularized Pre-Training for End-to-End Task-Oriented Dialogue
    Qin, Libo; Xu, Xiao; Wang, Lehan; Zhang, Yue; Che, Wanxiang
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, 31: 1601-1610
  • [26] A dynamic few-shot learning framework for medical image stream mining based on self-training
    Ye, Zhengqiang; Zhang, Wei
    EURASIP Journal on Advances in Signal Processing, 2023, 2023(01)
  • [27] Self-training and Pre-training Are Complementary for Speech Recognition
    Xu, Qiantong; Baevski, Alexei; Likhomanenko, Tatiana; Tomasello, Paden; Conneau, Alexis; Collobert, Ronan; Synnaeve, Gabriel; Auli, Michael
    2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), 2021: 3030-3034
  • [29] Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding
    Li, Shiyang; Yavuz, Semih; Chen, Wenhu; Yan, Xifeng
    Findings of the Association for Computational Linguistics: EMNLP 2021, 2021: 1006-1015
  • [30] Few-Shot Language Understanding Model for Task-Oriented Dialogues
    Xiang, Z.; Chen, H.; Wang, Q.; Li, N.
    Data Analysis and Knowledge Discovery, 2023, 7(09): 64-77