Evaluating the Cranfield Paradigm for Conversational Search Systems

Cited by: 8
Authors
Fu, Xiao [1]
Yilmaz, Emine [1, 2]
Lipani, Aldo [1]
Affiliations
[1] UCL, London, England
[2] Amazon, London, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
dialogue systems; evaluation; relevance; satisfaction; GAIN-BASED EVALUATION;
DOI
10.1145/3539813.3545126
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Due to the sequential and interactive nature of conversations, the application of traditional Information Retrieval (IR) methods like the Cranfield paradigm requires stronger assumptions. When building a test collection for Ad Hoc search, it is fair to assume that the relevance judgments provided by an annotator correlate well with the relevance judgments perceived by an actual user of the search engine. However, when building a test collection for conversational search, we do not know if it is fair to assume the same. In this paper, we perform a crowdsourcing study to evaluate the applicability of the Cranfield paradigm to conversational search systems. Our main aim is to understand the level of agreement, in terms of user satisfaction, between the users performing a search task in a conversational search system (i.e., directly assessing the system) and the users observing the search task being performed (i.e., indirectly assessing the system). The results of this study are paramount because they underpin and guide 1) the development of more realistic user models and simulators, and 2) the design of more reliable and robust evaluation measures for conversational search systems. Our results show that there is a fair agreement between direct and indirect assessments in terms of user satisfaction and that these two kinds of assessments share similar conversational patterns. Indeed, by collecting relevance assessments for each system utterance, we tested several conversational patterns that show a promising ability to predict user satisfaction.
Pages: 196-201
Number of pages: 6