Evaluating the Cranfield Paradigm for Conversational Search Systems

Cited by: 8
Authors
Fu, Xiao [1]
Yilmaz, Emine [1, 2]
Lipani, Aldo [1 ]
Affiliations
[1] UCL, London, England
[2] Amazon, London, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
dialogue systems; evaluation; relevance; satisfaction; gain-based evaluation
DOI
10.1145/3539813.3545126
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Due to the sequential and interactive nature of conversations, applying traditional Information Retrieval (IR) methods such as the Cranfield paradigm requires stronger assumptions. When building a test collection for ad hoc search, it is fair to assume that the relevance judgments provided by an annotator correlate well with the relevance perceived by an actual user of the search engine. However, when building a test collection for conversational search, we do not know whether the same assumption holds. In this paper, we perform a crowdsourcing study to evaluate the applicability of the Cranfield paradigm to conversational search systems. Our main aim is to measure the agreement, in terms of user satisfaction, between users performing a search task in a conversational search system (i.e., directly assessing the system) and users observing the search task being performed (i.e., indirectly assessing the system). The results of this study are important because they underpin and guide 1) the development of more realistic user models and simulators, and 2) the design of more reliable and robust evaluation measures for conversational search systems. Our results show fair agreement between direct and indirect assessments in terms of user satisfaction, and that these two kinds of assessments share similar conversational patterns. Indeed, by collecting relevance assessments for each system utterance, we tested several conversational patterns that show a promising ability to predict user satisfaction.
Pages: 196-201
Number of pages: 6