Interactive probes: Towards action-level evaluation for dialogue systems

Cited by: 3

Authors:

Liesenfeld, Andreas [1]

Dingemanse, Mark [2]

Affiliations:

[1] Radboud Univ Nijmegen, Ctr Language Studies, Erasmuspl 1, NL-6525 HT Nijmegen, Netherlands

[2] Radboud Univ Nijmegen, Nijmegen, Netherlands

Source:

DISCOURSE & COMMUNICATION | 2024 / Vol. 18 / Issue 06

Keywords:

Applied conversation analysis; conversational user interfaces; dialogue systems; usability testing; REPAIR;

DOI:

10.1177/17504813241267071

Chinese Library Classification (CLC):

G2 [Information and Knowledge Dissemination];

Subject classification code:

05 ; 0503 ;

Abstract:

Measures of 'humanness', 'coherence' or 'fluency' are the mainstay of dialogue system evaluation, but they don't target system capabilities and rarely offer actionable feedback. Reviewing recent work in this domain, we identify an opportunity for evaluation at the level of action sequences, rather than the more commonly targeted levels of whole conversations or single responses. We introduce interactive probes, an evaluation framework inspired by empirical work on social interaction that can help to systematically probe the capabilities of dialogue systems. We sketch some first probes in the domains of tellings and repair, two sequence types ubiquitous in human interaction and challenging for dialogue systems. We argue interactive probing can offer the requisite flexibility to keep up with developments in interactive language technologies and do justice to the open-endedness of action formation and ascription in interaction.


Pages: 954-964

Page count: 11

