Evaluating the Cranfield Paradigm for Conversational Search Systems

Cited by: 8

Authors:

Fu, Xiao [1]

Yilmaz, Emine [1,2]

Lipani, Aldo [1]

Affiliations:

[1] UCL, London, England

[2] Amazon, London, England

Source:

PROCEEDINGS OF THE 2022 ACM SIGIR INTERNATIONAL CONFERENCE ON THE THEORY OF INFORMATION RETRIEVAL, ICTIR 2022 | 2022

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC)

Keywords:

dialogue systems; evaluation; relevance; satisfaction; gain-based evaluation

DOI:

10.1145/3539813.3545126

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Due to the sequential and interactive nature of conversations, the application of traditional Information Retrieval (IR) methods like the Cranfield paradigm requires stronger assumptions. When building a test collection for ad hoc search, it is fair to assume that the relevance judgments provided by an annotator correlate well with the relevance judgments perceived by an actual user of the search engine. However, when building a test collection for conversational search, we do not know if it is fair to assume the same. In this paper, we perform a crowdsourcing study to evaluate the applicability of the Cranfield paradigm to conversational search systems. Our main aim is to understand the level of agreement, in terms of user satisfaction, between users performing a search task in a conversational search system (i.e., directly assessing the system) and users observing the search task being performed (i.e., indirectly assessing the system). The results of this study are paramount because they underpin and guide 1) the development of more realistic user models and simulators, and 2) the design of more reliable and robust evaluation measures for conversational search systems. Our results show that there is a fair agreement between direct and indirect assessments in terms of user satisfaction and that these two kinds of assessments share similar conversational patterns. Indeed, by collecting relevance assessments for each system utterance, we tested several conversational patterns that show a promising ability to predict user satisfaction.


Pages: 196-201

Page count: 6
