Budgeted Policy Learning for Task-Oriented Dialogue Systems

Authors
Zhang, Zhirui [1]
Li, Xiujun [2, 3]
Gao, Jianfeng [2]
Chen, Enhong [1]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Microsoft Res AI, Redmond, WA USA
[3] Univ Washington, Seattle, WA 98195 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new approach that extends Deep Dyna-Q (DDQ) by incorporating Budget-Conscious Scheduling (BCS) to make the best use of a fixed, small number of user interactions (the budget) for learning task-oriented dialogue agents. BCS consists of (1) a Poisson-based global scheduler to allocate the budget over different stages of training; (2) a controller to decide at each training step whether the agent is trained using real or simulated experiences; (3) a user goal sampling module to generate the experiences that are most effective for policy learning. Experiments on a movie-ticket booking task with simulated and real users show that our approach leads to significant improvements in success rate over the state-of-the-art baselines given the fixed budget.
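The three BCS components named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give formulas, so everything below is an assumption made for illustration and not the paper's algorithm: the function names, the use of a normalized Poisson probability mass function as allocation weights, the coin-flip controller, and the failure-weighted goal sampling are all hypothetical.

```python
import math
import random


def poisson_budget_schedule(total_budget, num_epochs, lam):
    """Split total_budget across epochs in proportion to a Poisson pmf.

    Hypothetical scheme: epoch k gets a share proportional to
    exp(-lam) * lam**k / k!, normalized over the available epochs.
    """
    weights = [math.exp(-lam) * lam ** k / math.factorial(k)
               for k in range(num_epochs)]
    norm = sum(weights)
    raw = [total_budget * w / norm for w in weights]
    alloc = [int(r) for r in raw]
    # Give the interactions lost to rounding to the epochs with the
    # largest fractional parts, so the allocation sums to the budget.
    leftover = total_budget - sum(alloc)
    order = sorted(range(num_epochs),
                   key=lambda k: raw[k] - alloc[k], reverse=True)
    for k in order[:leftover]:
        alloc[k] += 1
    return alloc


def choose_experience_source(remaining_epoch_budget, rng, p_real=0.5):
    """Toy controller: pick real experience with probability p_real,
    but only while the current epoch still has budget left."""
    if remaining_epoch_budget > 0 and rng.random() < p_real:
        return "real"
    return "simulated"


def sample_user_goal(failure_counts, rng):
    """Toy goal sampler: favor goals the agent has failed on more often.

    failure_counts maps a goal identifier to its number of failed dialogues.
    """
    goals = list(failure_counts)
    weights = [1 + failure_counts[g] for g in goals]
    return rng.choices(goals, weights=weights, k=1)[0]
```

For instance, `poisson_budget_schedule(100, 10, 3.0)` returns ten per-epoch allocations that sum to 100, concentrated around the early epochs near `lam`. A learned controller or a different sampling criterion could replace the toy rules without changing the overall structure.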
Pages: 3742-3751
Page count: 10