Task-based dialogue policy learning based on diffusion models

Cited by: 0

Authors:

Liu, Zhibin [1]

Pang, Rucai [1]

Dong, Zhaoan [1]

Affiliations:

[1] Qufu Normal Univ, Sch Comp Sci, Yantai Rd, Rizhao 276826, Shandong, Peoples R China

Source:

APPLIED INTELLIGENCE | 2024 / Vol. 54 / Issue 22

Funding:

National Natural Science Foundation of China;

Keywords:

Multi-domain dialogue; Reinforcement learning; Reward estimation; Behavioural cloning; Diffusion models;

DOI:

10.1007/s10489-024-05810-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The purpose of task-based dialogue systems is to help users achieve their dialogue needs using as few dialogue rounds as possible. As the demand increases, the dialogue tasks gradually involve multiple domains and develop in the direction of complexity and diversity. Achieving high performance with low computational effort has become an essential metric for multi-domain task-based dialogue systems. This paper proposes a new approach to guided dialogue policy. The method introduces a conditional diffusion model in the reinforcement learning Q-learning algorithm to regularise the policy in a diffusion Q-learning manner. The conditional diffusion model is used to learn the action value function, regulate the actions using regularisation, sample the actions, use the sampled actions in the policy update process, and additionally add a loss term that maximizes the value of the actions in the policy update process to improve the learning efficiency. Our proposed method is based on a conditional diffusion model, combined with the reinforcement learning TD3 algorithm as a dialogue policy and an inverse reinforcement learning approach to construct a reward estimator to provide rewards for policy updates as a way of completing a multi-domain dialogue task.
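The abstract names the ingredients of the policy update (a diffusion-based behaviour-cloning regulariser plus a loss term that maximises the action value) but gives no formulas. The sketch below is one plausible reading, not the authors' implementation: it assumes a DDPM-style noise-prediction loss for the regulariser and a Q term normalised by the mean absolute Q-value. The names `forward_noise`, `bc_loss`, `policy_loss`, `eta`, and `alpha_bar` are hypothetical, and the one-hot "dialogue act" vector is a placeholder.

```python
import math
import random

def forward_noise(action, alpha_bar, eps):
    """DDPM-style forward step: mix a clean action vector with Gaussian noise."""
    return [math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * e
            for a, e in zip(action, eps)]

def bc_loss(eps_true, eps_pred):
    """Behaviour-cloning term: mean squared error of the predicted noise."""
    return sum((t - p) ** 2 for t, p in zip(eps_true, eps_pred)) / len(eps_true)

def policy_loss(eps_true, eps_pred, q_values, eta=1.0):
    """Regularised policy loss: BC term minus a scaled mean Q-value."""
    mean_q = sum(q_values) / len(q_values)
    # Assumption: divide by the mean |Q| so that eta does not depend on reward scale.
    scale = sum(abs(q) for q in q_values) / len(q_values) or 1.0
    return bc_loss(eps_true, eps_pred) - eta * mean_q / scale

rng = random.Random(0)
action = [1.0, 0.0, 0.0]                      # hypothetical one-hot dialogue act
eps = [rng.gauss(0.0, 1.0) for _ in action]   # noise the model should predict
noisy = forward_noise(action, alpha_bar=0.9, eps=eps)

# With a perfect noise prediction the BC term vanishes and only the Q term remains.
perfect = policy_loss(eps, eps, q_values=[2.0, 4.0], eta=0.5)
```

In this reading, `eta` trades off staying close to the demonstrated dialogue actions against choosing actions the critic scores highly. In a full system the critic would be trained separately (the abstract mentions TD3) on rewards supplied by the learned reward estimator.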


Pages: 11752 - 11764

Page count: 13

50 items in total

[21] An Application of Task-based Learning in Teaching Practice
Zhu, Yan
Zhang, Xiaofang
2013 INTERNATIONAL CONFERENCE ON SOCIAL SCIENCES RESEARCH (SSR 2013), PT 1, 2013, 1 : 58 - 65
[22] Task-Based Neuromodulation Architecture for Lifelong Learning
Daram, Anurag Reddy
Kudithipudi, Dhireesha
Yanguas-Gil, Angel
PROCEEDINGS OF THE 2019 20TH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED), 2019, : 191 - 197
[23] Task-based language learning and teaching.
Mackey, A
STUDIES IN SECOND LANGUAGE ACQUISITION, 2004, 26 (03) : 480 - 482
[24] Foreign Language Teaching Methods in Task-Based Learning
Shukurova, Farida
ARAB WORLD ENGLISH JOURNAL, 2024, 15 (01) : 44 - 55
[25] Task-based learning environments in a virtual university
Whittington, D
Campbell, L
COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7) : 707 - 709
[26] Task-based language learning and teaching with technology
Dettori, Giuliana
BRITISH JOURNAL OF EDUCATIONAL TECHNOLOGY, 2011, 42 (05) : E114 - E115
[27] Task-Based Language Learning and Teaching with Technology
Ziegler, Nicole
Rock, Kristin
KOREAN LANGUAGE IN AMERICA, 2016, 20 (01) : 95 - 97
[28] Dynamic Guidance for Task-Based Exploratory Learning
Thomas, James M.
Young, R. Michael
ARTIFICIAL INTELLIGENCE IN EDUCATION, 2011, 6738 : 369 - 376
[29] Task-Based Neuromodulation Architecture for Lifelong Learning
Daram, Anurag Reddy
Kudithipudi, Dhireesha
Yanguas-Gil, Angel
2018 FOURTH INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION (ICCUBEA), 2018,
[30] Task-based language learning and teaching.
Byrnes, H
MODERN LANGUAGE JOURNAL, 2005, 89 (02) : 297 - 298
