Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems

Cited by: 0
Authors
Mi, Fei [1]
Zhou, Wanhao [2]
Cai, Fengyu [2]
Kong, Lingjing [2]
Huang, Minlie [3]
Faltings, Boi [2]
Affiliations
[1] Huawei Noah's Ark Lab, Hong Kong, People's Republic of China
[2] École Polytechnique Fédérale de Lausanne (EPFL), LIA, Lausanne, Switzerland
[3] Tsinghua University, CoAI, DCST, Beijing, People's Republic of China
Keywords
DOI
Not available
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Because labeling data for the different modules of task-oriented dialog (ToD) systems is expensive, a major challenge is to train these modules with the least amount of labeled data. Recently, large-scale pre-trained language models have shown promising results for few-shot learning in ToD. In this paper, we devise a self-training approach that utilizes abundant unlabeled dialog data to further improve state-of-the-art pre-trained models in few-shot learning scenarios for ToD systems. Specifically, we propose a self-training approach that iteratively labels the most confident unlabeled data to train a stronger Student model. Moreover, a new text augmentation technique (GradAug) is proposed to better train the Student by replacing non-crucial tokens using a masked language model. We conduct extensive experiments and present analyses on four downstream tasks in ToD, including intent classification, dialog state tracking, dialog act prediction, and response selection. Empirical results demonstrate that the proposed self-training approach consistently improves state-of-the-art pre-trained models (BERT, ToD-BERT) when only a small amount of labeled data is available.
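The core loop described in the abstract (iteratively pseudo-labeling the most confident unlabeled examples and retraining a stronger Student) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a scikit-learn logistic regression stands in for the BERT/ToD-BERT Student, the rounds and top_k settings are illustrative, and the GradAug augmentation step is omitted.

# Sketch of confidence-based self-training (pseudo-labeling), NOT the paper's code.
# A scikit-learn classifier stands in for the BERT/ToD-BERT Student; `rounds` and
# `top_k` are assumed hyperparameters for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression


def self_train(x_labeled, y_labeled, x_unlabeled, rounds=3, top_k=50):
    """Iteratively pseudo-label the most confident unlabeled examples
    and retrain a stronger Student on the enlarged training set."""
    x_train, y_train = x_labeled.copy(), y_labeled.copy()
    pool = x_unlabeled.copy()
    student = LogisticRegression(max_iter=1000)

    for _ in range(rounds):
        if len(pool) == 0:
            break
        # 1. Train the current Student (acting as Teacher for this round).
        student.fit(x_train, y_train)
        # 2. Score each unlabeled example by its maximum predicted probability.
        probs = student.predict_proba(pool)
        confidence = probs.max(axis=1)
        # 3. Keep only the top-k most confident examples and their pseudo-labels.
        chosen = np.argsort(-confidence)[:top_k]
        pseudo_labels = student.classes_[probs[chosen].argmax(axis=1)]
        # 4. Move them from the unlabeled pool into the training set.
        x_train = np.vstack([x_train, pool[chosen]])
        y_train = np.concatenate([y_train, pseudo_labels])
        pool = np.delete(pool, chosen, axis=0)

    return student


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy few-shot setup: 20 labeled and 500 unlabeled 16-dim "utterance" vectors.
    x_lab = rng.normal(size=(20, 16))
    y_lab = (x_lab[:, 0] > 0).astype(int)
    x_unlab = rng.normal(size=(500, 16))
    model = self_train(x_lab, y_lab, x_unlab)
    print("train accuracy:", model.score(x_lab, y_lab))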
Pages: 1887-1898
Number of pages: 12
Related Papers (50 in total)
• [41] Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System. Su, Yixuan; Shu, Lei; Mansimov, Elman; Gupta, Arshit; Cai, Deng; Lai, Yi-An; Zhang, Yi. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers, 2022: 4661-4676.
• [42] LOGEN: Few-Shot Logical Knowledge-Conditioned Text Generation With Self-Training. Deng, Shumin; Yang, Jiacheng; Ye, Hongbin; Tan, Chuanqi; Chen, Mosha; Huang, Songfang; Huang, Fei; Chen, Huajun; Zhang, Ningyu. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, 31: 2124-2133.
• [43] Few-Shot Learning of Force-Based Motions From Demonstration Through Pre-training of Haptic Representation. Aoyama, Marina Y.; Moura, Joao; Saito, Namiko; Vijayakumar, Sethu. 2024 IEEE International Conference on Robotics and Automation (ICRA 2024), 2024: 12839-12845.
• [44] Unified Multi-modal Pre-training for Few-shot Sentiment Analysis with Prompt-based Learning. Yu, Yang; Zhang, Dong; Li, Shoushan. Proceedings of the 30th ACM International Conference on Multimedia (MM 2022), 2022.
• [45] Inversed Pyramid Network with Spatial-adapted and Task-oriented Tuning for Few-shot Learning. Zhao, Xiaowei; Wang, Duorui; Bai, Shihao; Wang, Shuo; Gao, Yajun; Liang, Yu; Ma, Yuqing; Liu, Xianglong. Pattern Recognition, 2025, 164.
• [46] Few-Shot Learning and Self-Training for eNodeB Log Analysis for Service-Level Assurance in LTE Networks. Aoki, Shogo; Shiomoto, Kohei; Eng, Chin Lam. IEEE Transactions on Network and Service Management, 2020, 17(4): 2077-2089.
• [47] Self-paced Adversarial Training for Multimodal Few-shot Learning. Pahde, Frederik; Ostapenko, Oleksiy; Jaehnichen, Patrick; Klein, Tassilo; Nabi, Moin. 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), 2019: 218-226.
• [48] A Pre-training and Self-training Approach for Biomedical Named Entity Recognition. Gao, Shang; Kotevska, Olivera; Sorokine, Alexandre; Christian, J. Blair. PLOS ONE, 2021, 16(2).
• [49] Improving Pre-Training and Fine-Tuning for Few-Shot SAR Automatic Target Recognition. Zhang, Chao; Dong, Hongbin; Deng, Baosong. Remote Sensing, 2023, 15(6).
• [50] Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-training. Hardalov, Momchil; Arora, Arnav; Nakov, Preslav; Augenstein, Isabelle. Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI 2022), 2022: 10729-10737.