Learning Goal-oriented Dialogue Policy with Opposite Agent Awareness

Cited by: 0
Authors
Zhang, Zheng [1]
Liao, Lizi [2]
Zhu, Xiaoyan [1]
Chua, Tat-Seng [2]
Liu, Zitao [3]
Huang, Yan [3]
Huang, Minlie [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, State Key Lab Intelligent Technol & Syst, Inst Artificial Intelligence, Beijing Natl Res Ctr, Beijing, Peoples R China
[2] Natl Univ Singapore, Sch Comp, Singapore, Singapore
[3] TAL Educ Grp, Beijing, Peoples R China
Keywords
REINFORCEMENT; NETWORKS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most existing approaches to goal-oriented dialogue policy learning use reinforcement learning, which focuses on the target agent's policy and simply treats the opposite agent's policy as part of the environment. In real-world scenarios, however, the behavior of an opposite agent often exhibits certain patterns or follows a hidden policy, which the target agent can infer and exploit to facilitate its own decision making. This strategy mirrors human mental simulation, in which one first imagines a specific action and its probable results before actually carrying it out. We therefore propose an opposite-behavior-aware framework for policy learning in goal-oriented dialogues. We estimate the opposite agent's policy from its behavior and use this estimate to improve the target agent by treating it as part of the target policy. We evaluate our model on both cooperative and competitive dialogue tasks, showing superior performance over state-of-the-art baselines.
Pages: 122-132
Page count: 11
Related papers
50 in total
  • [1] Goal-Oriented Dialogue Policy Learning from Failures
    Lu, Keting
    Zhang, Shiqi
    Chen, Xiaoping
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019: 2596-2603
  • [2] AirDialogue: An Environment for Goal-Oriented Dialogue Research
    Wei, Wei
    Le, Quoc V.
    Dai, Andrew M.
    Li, Li-Jia
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018: 3844-3854
  • [3] Goal-Oriented Agent Testing Revisited
    Ekinci, Erdem Eser
    Tiryaki, Ali Murat
    Cetin, Oevuenc
    Dikenelli, Oguz
    [J]. AGENT-ORIENTED SOFTWARE ENGINEERING IX, 2009, 5386: 173-186
  • [4] Domain Expert Platform for Goal-Oriented Dialogue Collection
    Gosko, Didzis
    Znotins, Arturs
    Skadina, Inguna
    Gruzitis, Normunds
    Nespore-Berzkalne, Gunta
    [J]. EACL 2021: THE 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: PROCEEDINGS OF THE SYSTEM DEMONSTRATIONS, 2021: 295-301
  • [5] Learning Goal-Oriented Visual Dialog via Tempered Policy Gradient
    Zhao, Rui
    Tresp, Volker
    [J]. 2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018: 868-875
  • [6] Goal-oriented methodology for agent system development
    Shen, ZQ
    Miao, CY
    Gay, R
    Li, DT
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (04): 1413-1420
  • [7] Towards goal-oriented design of agent systems
    Khallouf, J
    Winikoff, M
    [J]. QSIC 2005: FIFTH INTERNATIONAL CONFERENCE ON QUALITY SOFTWARE, PROCEEDINGS, 2005: 389-394
  • [8] Hermes: Designing goal-oriented agent interactions
    Cheong, Christopher
    Winikoff, Michael
    [J]. AGENT-ORIENTED SOFTWARE ENGINEERING VI, 2006, 3950: 16-27
  • [9] Goal-oriented methodology for agent system development
    Shen, ZQ
    Li, DT
    Miao, CY
    Gay, R
    Miao, Y
    [J]. 2005 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON INTELLIGENT AGENT TECHNOLOGY, PROCEEDINGS, 2005: 95-101
  • [10] Hermes: Implementing goal-oriented agent interactions
    Cheong, C
    Winikoff, M
    [J]. PROGRAMMING MULTI-AGENT SYSTEMS, 2006, 3862: 168-183