Interactive probes: Towards action-level evaluation for dialogue systems

Cited by: 3

Authors:

Liesenfeld, Andreas [1]

Dingemanse, Mark [2]

Affiliations:

[1] Radboud Univ Nijmegen, Ctr Language Studies, Erasmuspl 1, NL-6525 HT Nijmegen, Netherlands

[2] Radboud Univ Nijmegen, Nijmegen, Netherlands

Source:

Discourse & Communication | 2024, Vol. 18, No. 6

Keywords:

Applied conversation analysis; conversational user interfaces; dialogue systems; usability testing; repair

DOI:

10.1177/17504813241267071

Chinese Library Classification (CLC) number:

G2 [Information and Knowledge Dissemination]

Discipline classification code:

05; 0503

Abstract:

Measures of 'humanness', 'coherence' or 'fluency' are the mainstay of dialogue system evaluation, but they don't target system capabilities and rarely offer actionable feedback. Reviewing recent work in this domain, we identify an opportunity for evaluation at the level of action sequences, rather than the more commonly targeted levels of whole conversations or single responses. We introduce interactive probes, an evaluation framework inspired by empirical work on social interaction that can help to systematically probe the capabilities of dialogue systems. We sketch some first probes in the domains of tellings and repair, two sequence types ubiquitous in human interaction and challenging for dialogue systems. We argue interactive probing can offer the requisite flexibility to keep up with developments in interactive language technologies and do justice to the open-endedness of action formation and ascription in interaction.

Pages: 954-964

Page count: 11
