Adversarial Learning of Task-Oriented Neural Dialog Models

Authors
Liu, Bing [1]
Lane, Ian [2]
Affiliations
[1] Carnegie Mellon Univ, Elect & Comp Engn, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Elect & Comp Engn, Language Technol Inst, Pittsburgh, PA 15213 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose an adversarial learning method for reward estimation in reinforcement learning (RL) based task-oriented dialog models. Most current RL-based task-oriented dialog systems require access to a reward signal from either user feedback or user ratings. Such user ratings, however, may not always be consistent or available in practice. Furthermore, online dialog policy learning with RL typically requires a large number of queries to users and therefore suffers from poor sample efficiency. To address these challenges, we propose an adversarial learning method that learns dialog rewards directly from dialog samples. These rewards are then used to optimize the dialog policy with policy-gradient-based RL. In an evaluation in a restaurant search domain, we show that the proposed adversarial dialog learning method achieves a higher dialog success rate than strong baseline methods. We further discuss the covariate shift problem in online adversarial dialog learning and show how it can be addressed with partial access to user feedback.
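The idea in the abstract, a discriminator whose output serves as the reward for a policy-gradient update, can be illustrated with a deliberately tiny sketch. This is not the authors' model: the paper works with neural networks over full dialogs, whereas the toy below assumes a single-step "dialog" with a handful of discrete actions, a tabular logistic discriminator, and exact expected gradients instead of sampled ones. All names (`train`, `expert_probs`, `lr_d`, `lr_pi`) are hypothetical, and using the discriminator probability directly as the reward is an assumption made here for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(expert_probs, steps=200, lr_d=0.5, lr_pi=0.5):
    """Alternate discriminator and policy updates on a toy one-step task.

    expert_probs: action distribution of the (assumed) successful samples.
    Returns the final policy distribution and discriminator scores.
    """
    n = len(expert_probs)
    theta = [0.0] * n  # policy logits
    w = [0.0] * n      # discriminator logits, one per action

    for _ in range(steps):
        pi = softmax(theta)
        d = [sigmoid(x) for x in w]

        # Discriminator step: gradient ascent on
        # E_expert[log D(a)] + E_policy[log(1 - D(a))].
        w = [
            w[a] + lr_d * (expert_probs[a] * (1.0 - d[a]) - pi[a] * d[a])
            for a in range(n)
        ]

        # Policy step: the discriminator score is the reward (assumption).
        reward = [sigmoid(x) for x in w]
        baseline = sum(pi[a] * reward[a] for a in range(n))
        # Exact expected policy gradient for a softmax policy:
        # dJ/dtheta_k = pi_k * (r_k - sum_a pi_a * r_a).
        theta = [
            theta[a] + lr_pi * pi[a] * (reward[a] - baseline)
            for a in range(n)
        ]

    return softmax(theta), [sigmoid(x) for x in w]


if __name__ == "__main__":
    policy, scores = train([0.1, 0.9])
    print("policy:", policy)
    print("discriminator scores:", scores)
```

In this toy setting the policy starts uniform and is pushed toward the action that the successful samples favor, without any hand-written reward. A realistic system would replace the tabular pieces with learned models and estimate both gradients from sampled dialogs.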
Pages: 350-359
Page count: 10