MEASURING THE RELIABILITY OF RANKING IN INFORMATION RETRIEVAL SYSTEMS EVALUATION

Authors
Rajagopal, Prabha [1 ]
Ravana, Sri Devi [1 ]
Affiliations
[1] Univ Malaya, Kuala Lumpur 50603, Malaysia
Keywords
Information Retrieval; System Evaluation; Reliability Testing; Intraclass Correlation Coefficient; TREC; Information Systems;
DOI
10.22452/mjcs.vol32no4.1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A reliable system is crucial in satisfying users' needs, but the reliability depends on the varying effects of the test collection. Reliability is usually evaluated by the similarity of sets of system rankings, to understand the impact of variations in relevance judgments or effectiveness metrics. However, such evaluations do not indicate the reliability of individual system rankings. This study proposes a method to measure the reliability of individual retrieval systems based on their relative rankings. The Intraclass Correlation Coefficient (ICC) is used as a reliability measure of individual system ranks. Various combinations of effectiveness metrics according to their clusters, selections of topic sizes, and Kendall's tau correlation coefficient with the gold standard are examined. The metrics average precision (AP) and rank-biased precision (RBP) are suitable for measuring the reliability of system rankings and for generalizing the outcome to other similar metrics. Highly reliable system rankings belong mostly to the top- and mid-performing systems and are strongly correlated with the gold standard system ranks. The proposed method can be replicated on other test collections, as it uses relative ranking to measure reliability. The study measures the ranking reliability of individual retrieval systems to indicate the level of reliability a user can expect from a retrieval system regardless of its performance.
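The two statistics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which ICC form the authors use or how they arrange their data, so this is an assumption: the sketch computes the Shrout and Fleiss ICC(2,1) (two-way random effects, absolute agreement, single measurement) and Kendall's tau-a without tie correction. The function names `icc_2_1` and `kendall_tau_a` and the example ranks are hypothetical, and this is not the authors' procedure.

```python
from itertools import combinations


def icc_2_1(ratings):
    """ICC(2,1) from a two-way ANOVA decomposition (assumed ICC form).

    ratings: list of n rows (e.g. systems), each with k measurements
    (e.g. the rank of that system under k evaluation conditions).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for rows, columns, and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares.
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def kendall_tau_a(x, y):
    """Kendall's tau-a between two equal-length rankings (no tie correction)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical data: ranks of five systems under three evaluation conditions.
ranks = [
    [1, 1, 2],
    [2, 3, 1],
    [3, 2, 3],
    [4, 5, 4],
    [5, 4, 5],
]
gold = [1, 2, 3, 4, 5]  # hypothetical gold standard ranking
first_condition = [row[0] for row in ranks]

print("ICC(2,1):", icc_2_1(ranks))
print("Kendall tau-a vs gold:", kendall_tau_a(first_condition, gold))
```

An ICC near 1 indicates that the ranks agree closely across conditions, and a tau near 1 indicates that a ranking orders the systems almost identically to the gold standard. In practice a statistics library would be preferable, since this sketch does not handle ties or degenerate inputs with zero variance.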
Pages: 253-268
Number of pages: 16