Interactive probes: Towards action-level evaluation for dialogue systems

Cited by: 3
Authors
Liesenfeld, Andreas [1]
Dingemanse, Mark [2]
Affiliations
[1] Radboud Univ Nijmegen, Ctr Language Studies, Erasmuspl 1, NL-6525 HT Nijmegen, Netherlands
[2] Radboud Univ Nijmegen, Nijmegen, Netherlands
Keywords
Applied conversation analysis; conversational user interfaces; dialogue systems; usability testing; repair
DOI
10.1177/17504813241267071
Chinese Library Classification
G2 [Information and knowledge dissemination]
Discipline classification code
05; 0503
Abstract
Measures of 'humanness', 'coherence' or 'fluency' are the mainstay of dialogue system evaluation, but they don't target system capabilities and rarely offer actionable feedback. Reviewing recent work in this domain, we identify an opportunity for evaluation at the level of action sequences, rather than the more commonly targeted levels of whole conversations or single responses. We introduce interactive probes, an evaluation framework inspired by empirical work on social interaction that can help to systematically probe the capabilities of dialogue systems. We sketch some first probes in the domains of tellings and repair, two sequence types ubiquitous in human interaction and challenging for dialogue systems. We argue interactive probing can offer the requisite flexibility to keep up with developments in interactive language technologies and do justice to the open-endedness of action formation and ascription in interaction.
Pages: 954-964
Page count: 11