Incorporating User Expectations and Behavior into the Measurement of Search Effectiveness

Cited by: 56
Authors
Moffat, Alistair [1]
Bailey, Peter [2, 4]
Scholer, Falk [3]
Thomas, Paul [2, 4]
Affiliations
[1] Univ Melbourne, Dept Comp & Informat Syst, Melbourne, Vic 3010, Australia
[2] Microsoft, Redmond, WA USA
[3] RMIT Univ, Sch Comp Sci & Informat Technol, GPO Box 2476, Melbourne, Vic 3001, Australia
[4] Microsoft Australia, 6 Natl Circuit, Barton, ACT 2600, Australia
Funding
Australian Research Council
Keywords
Experimentation; Measurement; User behavior; test collections; effectiveness metric; relevance measures; query; search; INFORMATION-SEEKING; TASK COMPLEXITY;
DOI
10.1145/3052768
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Information retrieval systems aim to help users satisfy information needs. We argue that the goal of the person using the system, and the pattern of behavior that they exhibit as they proceed to attain that goal, should be incorporated into the methods and techniques used to evaluate the effectiveness of IR systems, so that the resulting effectiveness scores have a useful interpretation that corresponds to the users' search experience. In particular, we investigate the role of search task complexity, and show that it has a direct bearing on the number of relevant answer documents sought by users in response to an information need, suggesting that useful effectiveness metrics must be goal sensitive. We further suggest that user behavior while scanning results listings is affected by the rate at which their goal is being realized, and hence that appropriate effectiveness metrics must be adaptive to the presence (or not) of relevant documents in the ranking. In response to these two observations, we present a new effectiveness metric, INST, that has both of the desired properties: INST employs a parameter T, a direct measure of the user's search goal that adjusts the top-weightedness of the evaluation score; moreover, as progress towards the target T is made, the modeled user behavior is adapted, to reflect the remaining expectations. INST is experimentally compared to previous effectiveness metrics, including Average Precision (AP), Normalized Discounted Cumulative Gain (NDCG), and Rank-Biased Precision (RBP), demonstrating our claims as to INST's usefulness. Like RBP, INST is a weighted-precision metric, meaning that each score can be accompanied by a residual that quantifies the extent of the score uncertainty caused by unjudged documents. 
As part of our experimentation, we use crowd-sourced data and score residuals to demonstrate that a wide range of queries arise for even quite specific information needs, and that these variant queries introduce significant levels of residual uncertainty into typical experimental evaluations. These causes of variability have wide-reaching implications for experiment design, and for the construction of test collections.
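The abstract notes that weighted-precision metrics such as RBP can report a residual alongside each score. As a minimal sketch of that idea, the Python below computes rank-biased precision with the standard geometric weights (1 - p) * p^(i - 1) and accumulates the weight of unjudged positions as the residual. It is not the INST metric itself; the function name `rbp_with_residual`, the use of `None` to mark an unjudged document, and the default persistence p = 0.8 are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def rbp_with_residual(
    judgments: Sequence[Optional[float]], p: float = 0.8
) -> Tuple[float, float]:
    """Return (score, residual) for rank-biased precision.

    judgments[i] is the relevance in [0, 1] of the document at rank i + 1,
    or None if that document is unjudged (an illustrative convention).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("persistence p must lie strictly between 0 and 1")

    score = 0.0
    residual = 0.0
    for i, rel in enumerate(judgments):
        weight = (1.0 - p) * p ** i  # weight of rank i + 1
        if rel is None:
            residual += weight       # unjudged: could add up to `weight`
        else:
            score += weight * rel
    # Everything beyond the evaluated depth is unjudged; those weights sum to p^d.
    residual += p ** len(judgments)
    return score, residual


if __name__ == "__main__":
    score, residual = rbp_with_residual([1.0, None, 0.0, 1.0], p=0.5)
    print(f"score={score}, residual={residual}")
```

The true score under complete judgments lies between `score` and `score + residual`, so a large residual signals that unjudged documents leave the measurement uncertain.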
Pages: 38
Related papers
50 in total
  • [1] Incorporating user search behavior into relevance feedback
    Ruthven, I
    Lalmas, M
    van Rijsbergen, K
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2003, 54 (06): 529 - 549
  • [2] A METRIC FOR MEASURING WEB SEARCH RESULTS SATISFACTION INCORPORATING USER BEHAVIOR
    Yu, Jinxiu
    Lu, Yueming
    Zhang, Fangwei
    Sun, Songlin
    [J]. 2012 IEEE 2ND INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENT SYSTEMS (CCIS) VOLS 1-3, 2012: 583 - 586
  • [3] Investigating the Relationship between In-Situ User Expectations and Web Search Behavior
    Wang, Ben
    Liu, Jiqun
    [J]. Proceedings of the Association for Information Science and Technology, 2022, 59 (01) : 827 - 829
  • [4] Interactive Search Result Clustering: A Study of User Behavior and Retrieval Effectiveness
    Gong, Xuemei
    Ke, Weimao
    Zhang, Yan
    Broussard, Ramona
    [J]. JCDL'13: PROCEEDINGS OF THE 13TH ACM/IEEE-CS JOINT CONFERENCE ON DIGITAL LIBRARIES, 2013: 167 - 170
  • [5] Incorporating user behavior flow for user risk assessment
    Shan, Yuxiang
    Ren, Qin
    Yu, Gang
    Li, Tiantian
    Cao, Bin
    [J]. INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2023, 19 (02) : 80 - 101
  • [6] Modeling user search behavior
    Baeza-Yates, R
    Hurtado, C
    Mendoza, M
    Dupret, G
    [J]. THIRD LATIN AMERICAN WEB CONGRESS, PROCEEDINGS, 2005: 242 - 251
  • [7] END-USER SEARCH BEHAVIORS AND THEIR RELATIONSHIP TO SEARCH EFFECTIVENESS
    WILDEMUTH, BM
    MOORE, ME
    [J]. BULLETIN OF THE MEDICAL LIBRARY ASSOCIATION, 1995, 83 (03): 294 - 304
  • [8] Incorporating User Grouping into Retweeting Behavior Modeling
    Zhu, Jinhai
    Ma, Shuai
    Zhang, Hui
    Hu, Chunming
    Li, Xiong
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2018, PT I, 2018, 10827 : 474 - 490
  • [9] User Behavior Analysis and User Modeling for Complex Search
    Mao, Jiaxin
    [J]. ADVANCES IN INFORMATION RETRIEVAL, ECIR 2017, 2017, 10193 : 778 - 778
  • [10] Investigating the role of in-situ user expectations in Web search
    Wang, Ben
    Liu, Jiqun
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (03)