Task-based dialogue policy learning based on diffusion models

Authors
Liu, Zhibin [1]
Pang, Rucai [1]
Dong, Zhaoan [1]
Affiliations
[1] Qufu Normal Univ, Sch Comp Sci, Yantai Rd, Rizhao 276826, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-domain dialogue; Reinforcement learning; Reward estimation; Behavioural cloning; Diffusion models
DOI
10.1007/s10489-024-05810-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Task-based dialogue systems aim to help users meet their dialogue needs in as few dialogue rounds as possible. As demand grows, dialogue tasks increasingly span multiple domains and become more complex and diverse. Achieving high performance with low computational effort has therefore become an essential requirement for multi-domain task-based dialogue systems. This paper proposes a new approach to guiding the dialogue policy. The method introduces a conditional diffusion model into the Q-learning algorithm of reinforcement learning and regularises the policy in a diffusion Q-learning manner. The conditional diffusion model is used to learn the action value function, to regularise the actions, and to sample actions that are then used in the policy update; an additional loss term that maximises the value of the actions is added to the policy update to improve learning efficiency. The proposed method combines a conditional diffusion model with the reinforcement learning TD3 algorithm as the dialogue policy, and uses an inverse reinforcement learning approach to construct a reward estimator that provides rewards for policy updates, in order to complete multi-domain dialogue tasks.
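The abstract describes a policy update with two parts: a diffusion (behaviour-cloning) term that keeps actions close to the data, and an added term that maximises the value of the sampled actions. The sketch below illustrates that general diffusion Q-learning pattern in plain Python. It is not the paper's implementation: the abstract gives no equations, so the names `noisy_action`, `bc_loss`, `policy_loss`, `toy_eps_model`, `toy_q`, the weight `eta`, and the noise schedule `alpha_bars` are all assumptions made for illustration.

```python
import math
import random

def noisy_action(action, noise, alpha_bar):
    """Forward diffusion: blend a clean action vector with Gaussian noise."""
    return [math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * e
            for a, e in zip(action, noise)]

def bc_loss(eps_model, state, action, alpha_bars, rng):
    """Denoising (behaviour-cloning) loss for one sample at a random step."""
    step = rng.randrange(len(alpha_bars))
    noise = [rng.gauss(0.0, 1.0) for _ in action]
    pred = eps_model(noisy_action(action, noise, alpha_bars[step]), state, step)
    # Squared error between the predicted and the true noise.
    return sum((p - e) ** 2 for p, e in zip(pred, noise))

def policy_loss(bc, q_value, eta):
    """Regularised objective: stay near the data, prefer high-value actions."""
    return bc - eta * q_value

# Hypothetical stand-ins for the learned networks (assumptions, not the paper's).
def toy_eps_model(noisy, state, step):
    return [0.0 for _ in noisy]          # always predicts zero noise

def toy_q(state, action):
    return -sum(a * a for a in action)   # prefers actions near the origin

rng = random.Random(0)
alpha_bars = [0.9, 0.5, 0.1]             # assumed cumulative noise schedule
state = [1.0]
data_action = [0.2, -0.1]                # action taken from the dialogue data
sampled_action = [0.1, 0.0]              # stand-in for an action sampled from the policy
loss = policy_loss(
    bc_loss(toy_eps_model, state, data_action, alpha_bars, rng),
    toy_q(state, sampled_action),
    eta=0.5,
)
```

In a real system the noise predictor and the value function would be trained networks, the value term would be evaluated on actions produced by the reverse diffusion process, and the loss would be minimised by gradient descent; here fixed toy functions stand in for all three so that the structure of the objective is visible.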
Pages: 11752-11764
Number of pages: 13