A Survey on Recent Advances and Challenges in Reinforcement Learning Methods for Task-oriented Dialogue Policy Learning

Authors
Wai-Chung Kwan
Hong-Ru Wang
Hui-Min Wang
Kam-Fai Wong
Affiliation
[1] The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management
Source
Machine Intelligence Research, 2023, 20 (03)
Keywords
Dialogue policy learning (DPL); task-oriented dialogue system (TOD); reinforcement learning (RL); dialogue system; Markov decision process
DOI
Not available
Abstract
Dialogue policy learning (DPL) is a key component of a task-oriented dialogue (TOD) system. Its goal is to decide the system's next action at each turn, given the current dialogue state, according to a learned dialogue policy. Reinforcement learning (RL) is widely used to optimize this policy: in the learning process, the user is regarded as the environment and the system as the agent. In this paper, we present an overview of recent advances and challenges in dialogue policy learning from the perspective of RL. More specifically, we identify the problems and summarize the corresponding solutions for RL-based dialogue policy learning. In addition, we provide a comprehensive survey of applying RL to DPL by categorizing recent methods according to five basic elements of RL. We believe this survey can shed light on future research in DPL.
Pages: 318-334
Number of pages: 16
Related papers
  • [1] A Survey on Recent Advances and Challenges in Reinforcement Learning Methods for Task-oriented Dialogue Policy Learning
    Kwan, Wai-Chung
    Wang, Hong-Ru
    Wang, Hui-Min
    Wong, Kam-Fai
    [J]. MACHINE INTELLIGENCE RESEARCH, 2023, 20 (03) : 318 - 334
  • [2] A Survey of Task-Oriented Dialogue Policies Based on Reinforcement Learning
    Xu, Kai
    Wang, Zhen-Yu
    Wang, Xu
    Qin, Hua
    Long, Yu-Xuan
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2024, 47 (06) : 1201 - 1231
  • [3] Task-oriented Dialogue System Based on Reinforcement Learning
    Song, Meina
    Chen, Zhongfu
    Niu, Peiqing
    E, Haihong
    [J]. PROCEEDINGS OF 2019 IEEE 10TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2019), 2019 : 93 - 98
  • [4] Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems
    Li, Ziming
    Kiseleva, Julia
    de Rijke, Maarten
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020
  • [5] Budgeted Policy Learning for Task-Oriented Dialogue Systems
    Zhang, Zhirui
    Li, Xiujun
    Gao, Jianfeng
    Chen, Enhong
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019 : 3742 - 3751
  • [6] Transfer Learning based Task-oriented Dialogue Policy for Multiple Domains using Hierarchical Reinforcement Learning
    Saha, Tulika
    Saha, Sriparna
    Bhattacharyya, Pushpak
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [7] Domain Complexity and Policy Learning in Task-Oriented Dialogue Systems
    Papangelis, Alexandros
    Ultes, Stefan
    Stylianou, Yannis
    [J]. ADVANCED SOCIAL INTERACTION WITH AGENTS, 2019, 510 : 63 - 69
  • [8] CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning
    Verma, Siddharth
    Fu, Justin
    Yang, Mengjiao
    Levine, Sergey
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022 : 4471 - 4491
  • [9] Advances and Challenges in Multi-Domain Task-Oriented Dialogue Policy Optimization
    Rohmatillah, Mahdin
    Chien, Jen-Tzung
    [J]. APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2023, 12 (01)
  • [10] Recent Neural Methods on Dialogue State Tracking for Task-Oriented Dialogue Systems: A Survey
    Balaraman, Vevake
    Sheikhalishahi, Seyedmostafa
    Magnini, Bernardo
    [J]. SIGDIAL 2021: 22ND ANNUAL MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE (SIGDIAL 2021), 2021 : 239 - 251