Adversarial Learning of Task-Oriented Neural Dialog Models

Cited by: 0
Authors
Liu, Bing [1]
Lane, Ian [2]
Affiliations
[1] Carnegie Mellon Univ, Elect & Comp Engn, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Elect & Comp Engn, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we propose an adversarial learning method for reward estimation in reinforcement learning (RL) based task-oriented dialog models. Most current RL-based task-oriented dialog systems require access to a reward signal from either user feedback or user ratings. Such user ratings, however, may not always be consistent or available in practice. Furthermore, online dialog policy learning with RL typically requires a large number of queries to users and therefore suffers from poor sample efficiency. To address these challenges, we propose an adversarial learning method that learns dialog rewards directly from dialog samples. These rewards are then used to optimize the dialog policy with policy-gradient-based RL. In an evaluation in a restaurant search domain, we show that the proposed adversarial dialog learning method achieves a higher dialog success rate than strong baseline methods. We further discuss the covariate shift problem in online adversarial dialog learning and show how it can be addressed with partial access to user feedback.
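Illustrative sketch (not part of the original record, and not the authors' implementation): the abstract describes learning a reward from dialog samples with a discriminator and then optimizing the policy with policy gradient. The toy Python below shows that loop in a heavily simplified form. It assumes a single-turn "dialog" reduced to one of three actions, a logistic-regression discriminator over one-hot features, and a softmax policy updated with REINFORCE. All names (`Discriminator`, `train`, `EXPERT_ACTION`) and all hyperparameters are hypothetical.

```python
import math
import random

N_ACTIONS = 3
EXPERT_ACTION = 2  # assumption: every "human" dialog sample uses this action


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(action):
    return [1.0 if i == action else 0.0 for i in range(N_ACTIONS)]


class Discriminator:
    """Toy reward estimator: logistic regression over dialog features."""

    def __init__(self, lr=0.1):
        self.w = [0.0] * N_ACTIONS
        self.b = 0.0
        self.lr = lr

    def score(self, feats):
        # Estimated probability that the sample came from the human data.
        return sigmoid(self.b + sum(w * f for w, f in zip(self.w, feats)))

    def update(self, feats, label):
        # One gradient step on binary cross-entropy (label 1 = human, 0 = agent).
        err = label - self.score(feats)
        self.w = [w + self.lr * err * f for w, f in zip(self.w, feats)]
        self.b += self.lr * err


def sample(probs, rng):
    r, acc = rng.random(), 0.0
    for action, p in enumerate(probs):
        acc += p
        if r < acc:
            return action
    return len(probs) - 1


def train(steps=3000, policy_lr=0.2, seed=0):
    rng = random.Random(seed)
    disc = Discriminator()
    logits = [0.0] * N_ACTIONS
    baseline = 0.5  # running-mean reward baseline to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        action = sample(probs, rng)
        # Discriminator step: human sample labeled 1, agent sample labeled 0.
        disc.update(one_hot(EXPERT_ACTION), 1.0)
        disc.update(one_hot(action), 0.0)
        # The discriminator score of the agent's own sample is its reward.
        reward = disc.score(one_hot(action))
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # REINFORCE step: d log pi(action) / d logit_k = 1[k == action] - pi_k.
        for k in range(N_ACTIONS):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += policy_lr * advantage * grad
    return softmax(logits), disc


if __name__ == "__main__":
    final_probs, _ = train()
    print([round(p, 3) for p in final_probs])
```

In this toy setting no user rating is ever consulted: the only learning signal the policy receives is the discriminator's score. A real task-oriented system would replace the one-hot features with an encoding of the full multi-turn dialog and the softmax table with a neural policy.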
Pages: 350-359
Page count: 10
Related papers
50 in total
  • [1] ITERATIVE POLICY LEARNING IN END-TO-END TRAINABLE TASK-ORIENTED NEURAL DIALOG MODELS
    Liu, Bing
    Lane, Ian
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 482 - 489
  • [2] Incremental Dialog Processing in a Task-Oriented Dialog
    Ghigi, Fabrizio
    Eskenazi, Maxine
    Ines Torres, M.
    Lee, Sungjin
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 308 - 312
  • [3] A Two-Step Neural Dialog State Tracker for Task-Oriented Dialog Processing
    Kim, A-Yeong
    Song, Hyun-Je
    Park, Seong-Bae
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2018, 2018
  • [4] Scheduled Dialog Policy Learning: An Automatic Curriculum Learning Framework for Task-oriented Dialog System
    Liu, Sihong
    Zhang, Jinchao
    He, Keqing
    Xu, Weiran
    Zhou, Jie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1091 - 1102
  • [5] Adversarial Learning of Privacy-Preserving and Task-Oriented Representations
    Xiao, Taihong
    Tsai, Yi-Hsuan
    Sohn, Kihyuk
    Chandraker, Manmohan
    Yang, Ming-Hsuan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 12434 - 12441
  • [6] Enhancing Troubleshooting Task-Oriented Dialog Systems with Large Language Models
    Zhou, Jiahao
    Zhang, Qiang
    Zhang, Fengda
    Yuan, Caixia
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2024, PT VI, 2025, 15206 : 328 - 338
  • [7] An empirical assessment of deep learning approaches to task-oriented dialog management
    Mateju, Lukas
    Griol, David
    Callejas, Zoraida
    Molina, Jose Manuel
    Sanchis, Araceli
    NEUROCOMPUTING, 2021, 439 : 327 - 339
  • [8] Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog
    Zhang, Jiaping
    Zhao, Tiancheng
    Yu, Zhou
    19TH ANNUAL MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE (SIGDIAL 2018), 2018, : 140 - 150
  • [9] Continual Learning for Natural Language Generation in Task-oriented Dialog Systems
    Mi, Fei
    Chen, Liangwei
    Zhao, Mengjie
    Huang, Minlie
    Faltings, Boi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020,
  • [10] ANYTOD: A Programmable Task-Oriented Dialog System
    Zhao, Jeffrey
    Cao, Yuan
    Gupta, Raghav
    Lee, Harrison
    Rastogi, Abhinav
    Wang, Mingqiu
    Soltau, Hagen
    Shafran, Izhak
    Wu, Yonghui
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 16189 - 16204