Adversarial Learning of Task-Oriented Neural Dialog Models

Cited by: 0
Authors
Liu, Bing [1]
Lane, Ian [2]
Affiliations
[1] Carnegie Mellon Univ, Elect & Comp Engn, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Elect & Comp Engn, Language Technol Inst, Pittsburgh, PA 15213 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose an adversarial learning method for reward estimation in reinforcement learning (RL) based task-oriented dialog models. Most current RL-based task-oriented dialog systems require access to a reward signal from either user feedback or user ratings. Such user ratings, however, may not always be consistent or available in practice. Furthermore, online dialog policy learning with RL typically requires a large number of queries to users and therefore suffers from poor sample efficiency. To address these challenges, we propose an adversarial learning method that learns dialog rewards directly from dialog samples. These rewards are then used to optimize the dialog policy with policy-gradient-based RL. In an evaluation in a restaurant search domain, we show that the proposed adversarial dialog learning method achieves a higher dialog success rate than strong baseline methods. We further discuss the covariate shift problem in online adversarial dialog learning and show how it can be addressed with partial access to user feedback.
Pages: 350-359
Page count: 10