Interactive probes: Towards action-level evaluation for dialogue systems

Cited by: 3
Authors
Liesenfeld, Andreas [1]
Dingemanse, Mark [2]
Affiliations
[1] Radboud Univ Nijmegen, Ctr Language Studies, Erasmuspl 1, NL-6525 HT Nijmegen, Netherlands
[2] Radboud Univ Nijmegen, Nijmegen, Netherlands
Keywords
Applied conversation analysis; conversational user interfaces; dialogue systems; usability testing; REPAIR
DOI
10.1177/17504813241267071
Chinese Library Classification number
G2 [Information and Knowledge Dissemination]
Discipline classification code
05; 0503
Abstract
Measures of 'humanness', 'coherence' or 'fluency' are the mainstay of dialogue system evaluation, but they don't target system capabilities and rarely offer actionable feedback. Reviewing recent work in this domain, we identify an opportunity for evaluation at the level of action sequences, rather than the more commonly targeted levels of whole conversations or single responses. We introduce interactive probes, an evaluation framework inspired by empirical work on social interaction that can help to systematically probe the capabilities of dialogue systems. We sketch some first probes in the domains of tellings and repair, two sequence types ubiquitous in human interaction and challenging for dialogue systems. We argue interactive probing can offer the requisite flexibility to keep up with developments in interactive language technologies and do justice to the open-endedness of action formation and ascription in interaction.
Pages: 954-964
Page count: 11
Related papers
50 in total
  • [1] Action-Level Intention Selection for BDI Agents
    Yao, Yuan
    Logan, Brian
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 1227 - 1236
  • [2] Dialog-to-Actions: Building Task-Oriented Dialogue System via Action-Level Generation
    Hua, Yuncheng
    Xi, Xiangyu
    Jiang, Zheng
    Zhang, Guanwei
    Sun, Chaobo
    Wan, Guanglu
    Ye, Wei
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 3255 - 3259
  • [3] Design and verification of SystemC transaction-level models
    Habibi, A
    Tahar, S
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2006, 14 (01) : 57 - 68
  • [4] Contextual Interactive Evaluation of TTS Models in Dialogue Systems
    Wang, Siyang
    Szekely, Eva
    Gustafson, Joakim
    INTERSPEECH 2024, 2024, : 2965 - 2969
  • [5] Evidence for action-level imitation of temporally morphed throwing movements
    Lestou, V
    Pollick, FE
    Vogt, S
    JOURNAL OF COGNITIVE NEUROSCIENCE, 2002, : 116 - 116
  • [6] Modeling Action-level Satisfaction for Search Task Satisfaction Prediction
    Wang, Hongning
    Song, Yang
    Chang, Ming-Wei
    He, Xiaodong
    Hassan, Ahmed
    White, Ryen W.
    SIGIR'14: PROCEEDINGS OF THE 37TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2014, : 123 - 132
  • [7] An Action-Level Assistant for Robotic Manipulation: User Experience and Performance
    Martins, Diogo
    Aldhaheri, Sara
    Kopanev, Pavel
    Pairet, Eric
    Ardon, Paola
    Sa, Alirio
    2023 XIII BRAZILIAN SYMPOSIUM ON COMPUTING SYSTEMS ENGINEERING, SBESC, 2023,
  • [8] Action-level real-time DEVS modeling and simulation
    Sarjoughian, Hessam S.
    Gholami, Soroosh
    SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL, 2015, 91 (10) : 869 - 887
  • [9] Action-Level Real-Time Network-on-Chip Modeling
    Gholami, Soroosh
    Sarjoughian, Hessam S.
    SIMULATION MODELLING PRACTICE AND THEORY, 2017, 77 : 272 - 291
  • [10] A COMPARISON BETWEEN OSHA-COMPLIANCE CRITERIA AND ACTION-LEVEL DECISION CRITERIA
    ROCK, JC
    AMERICAN INDUSTRIAL HYGIENE ASSOCIATION JOURNAL, 1982, 43 (05) : 297 - 313