AgentGraph: Toward Universal Dialogue Management With Structured Deep Reinforcement Learning

Cited by: 26
Authors
Chen, Lu [1]
Chen, Zhi [1]
Tan, Bowen [1]
Long, Sishan [1]
Gasic, Milica [2]
Yu, Kai [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Heinrich Heine Univ Dusseldorf, D-40225 Dusseldorf, Germany
Keywords
Dialogue policy; deep reinforcement learning; graph neural networks; policy adaptation; transfer learning; STATE; SYSTEMS;
DOI
10.1109/TASLP.2019.2919872
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Dialogue policy plays an important role in task-oriented spoken dialogue systems. It determines how to respond to users. Recently proposed deep reinforcement learning (DRL) approaches have been used for policy optimization. However, these deep models are still challenging for two reasons: first, many DRL-based policies are not sample efficient; and second, most models do not have the capability of policy transfer between different domains. In this paper, we propose a universal framework, AgentGraph, to tackle these two problems. AgentGraph combines a graph neural network (GNN) based architecture with a DRL-based algorithm. It can be regarded as a multi-agent reinforcement learning approach. Each agent corresponds to a node in a graph, which is defined according to the dialogue domain ontology. When making a decision, each agent can communicate with its neighbors on the graph. Under the AgentGraph framework, we further propose a dual GNN-based dialogue policy, which implicitly decomposes the decision in each turn into a high-level global decision and a low-level local decision. Experiments show that AgentGraph models significantly outperform traditional reinforcement learning approaches on most of the 18 tasks of the PyDial benchmark. Moreover, when transferred from the source task to a target task, these models not only have acceptable initial performance but also converge much faster on the target task.
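The agent-per-node idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the three-node ontology, the mean-of-neighbors update, the 0.5 mixing weights, and all scores below are hypothetical choices made only to show one round of neighbor communication followed by a two-level (agent, then action) decision.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def mean_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def message_passing_step(
    states: Dict[str, Vector], neighbors: Dict[str, List[str]]
) -> Dict[str, Vector]:
    """Each agent mixes its own state with the mean of its neighbors' states.

    The 0.5/0.5 mixing is an assumption; a learned GNN layer would use
    trainable weights and a nonlinearity instead.
    """
    updated = {}
    for node, own in states.items():
        incoming = mean_vectors([states[m] for m in neighbors[node]])
        updated[node] = [0.5 * a + 0.5 * b for a, b in zip(own, incoming)]
    return updated


def dual_decision(
    agent_scores: Dict[str, float],
    action_scores: Dict[str, Dict[str, float]],
) -> Tuple[str, str]:
    """High-level choice of an agent, then low-level choice of its action."""
    agent = max(agent_scores, key=agent_scores.get)
    local = action_scores[agent]
    action = max(local, key=local.get)
    return agent, action


# Hypothetical ontology: one slot-independent node and two slot nodes.
neighbors = {
    "general": ["food", "area"],
    "food": ["general", "area"],
    "area": ["general", "food"],
}
states = {"general": [1.0, 0.0], "food": [0.0, 1.0], "area": [0.0, 0.0]}
states = message_passing_step(states, neighbors)

# Hypothetical scores standing in for the outputs of learned value heads.
agent_scores = {"general": 0.2, "food": 0.7, "area": 0.1}
action_scores = {
    "general": {"hello": 0.3, "bye": 0.1},
    "food": {"request": 0.6, "confirm": 0.4},
    "area": {"request": 0.5, "confirm": 0.2},
}
print(dual_decision(agent_scores, action_scores))
```

Because the update rule depends only on the graph structure and not on the number of nodes, the same functions apply unchanged to an ontology with different slots, which is the property that makes this kind of design a natural fit for transfer between domains.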
Pages: 1378-1391
Page count: 14
Related papers
50 in total
  • [1] Distributed Structured Actor-Critic Reinforcement Learning for Universal Dialogue Management
    Chen, Zhi
    Chen, Lu
    Liu, Xiaoyuan
    Yu, Kai
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 2400 - 2411
  • [2] Optimizing Policy via Deep Reinforcement Learning for Dialogue Management
    Xu, Guanghao
    Lee, Hyunjung
    Koo, Myoung-Wan
    Seo, Jungyun
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2018, : 582 - 589
  • [3] POLICY ADAPTATION FOR DEEP REINFORCEMENT LEARNING-BASED DIALOGUE MANAGEMENT
    Chen, Lu
    Chang, Cheng
    Chen, Zhi
    Tan, Bowen
    Gasic, Milica
    Yu, Kai
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6074 - 6078
  • [4] Reinforcement Learning for Personalized Dialogue Management
    den Hengst, Floris
    Hoogendoorn, Mark
    van Harmelen, Frank
    Bosman, Joost
    2019 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2019), 2019, : 59 - 67
  • [5] Structured Control Nets for Deep Reinforcement Learning
    Srouji, Mario
    Zhang, Jian
    Salakhutdinov, Ruslan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [6] CURIOSITY-DRIVEN REINFORCEMENT LEARNING FOR DIALOGUE MANAGEMENT
    Wesselmann, Paula
    Wu, Yen-Chen
    Gasic, Milica
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 7210 - 7214
  • [7] A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimization
    Daubigney, Lucie
    Geist, Matthieu
    Chandramohan, Senthilkumar
    Pietquin, Olivier
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2012, 6 (08) : 891 - 902
  • [8] Universal quantum control through deep reinforcement learning
    Murphy Yuezhen Niu
    Sergio Boixo
    Vadim N. Smelyanskiy
    Hartmut Neven
    npj Quantum Information, 5
  • [9] Universal quantum control through deep reinforcement learning
    Niu, Murphy Yuezhen
    Boixo, Sergio
    Smelyanskiy, Vadim N.
    Neven, Hartmut
    NPJ QUANTUM INFORMATION, 2019, 5 (1)
  • [10] Deep Reinforcement Learning of Dialogue Policies with Less Weight Updates
    Cuayahuitl, Heriberto
    Yu, Seunghak
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2511 - 2515