Current Developments in Information Retrieval Evaluation

Cited by: 0
Authors
Mandl, Thomas [1]
Affiliation
[1] Univ Hildesheim, Hildesheim, Germany
Keywords
DOI
Not available
CLC classification number
TP [Automation technology; Computer technology];
Discipline classification code
0812;
Abstract
In the last decade, many evaluation results have been created within evaluation initiatives such as TREC, NTCIR and CLEF. The large amount of data available has led to substantial research on the validity of the evaluation procedure. An evaluation based on the Cranfield paradigm basically requires topics as descriptions of information needs, a document collection, systems to compare, human jurors who judge the documents retrieved by the systems against the information need descriptions, and some metric to compare the systems. All of these elements have been subject to scientific discussion. How many topics, systems, jurors and juror decisions are necessary to achieve valid results? How can validity be measured? Which metrics are the most reliable, and which metrics are appropriate from a user perspective? Examples from current CLEF experiments are used to illustrate some of these issues. User-based evaluations confront test users with the results of search systems and let them solve information tasks given in the experiment. In such a test setting, the performance of a user can be measured by observing the number of relevant documents he or she finds. This measure can be compared to a gold standard of relevance for the search topic to see whether the perceived performance correlates with an objective notion of relevance defined by a juror. In addition, users can be asked about their satisfaction with the search system and its results. In recent years, there has been growing concern about how well the results of batch and user studies correlate. When systems improve in a batch comparison and bring more relevant documents into the result list, do users benefit from this improvement? Are users more satisfied with better result lists, and do better systems enable them to find more relevant documents? Some studies could not confirm these relations between system performance and user satisfaction.
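As a rough illustration of the Cranfield-style setup described in the abstract (juror relevance judgments, ranked system output, an effectiveness metric, and a rank correlation between batch scores and user success), the following minimal Python sketch uses entirely hypothetical data; the document identifiers, scores and function names are assumptions for illustration, not material from the paper.

```python
# Minimal sketch of a Cranfield-style comparison: juror judgments (qrels),
# a simple batch metric (precision at 10), and a Kendall's tau correlation
# between batch system scores and hypothetical user success rates.
from itertools import combinations

# Juror judgments: topic -> documents judged relevant (the "gold standard"). Hypothetical.
qrels = {"topic1": {"d1", "d3", "d7"}, "topic2": {"d2", "d5"}}

# Ranked result lists returned by two hypothetical systems.
runs = {
    "sysA": {"topic1": ["d1", "d3", "d9", "d7"], "topic2": ["d5", "d8"]},
    "sysB": {"topic1": ["d9", "d4", "d1"], "topic2": ["d6", "d2"]},
}

def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k retrieved documents that the juror judged relevant."""
    top = ranked[:k]
    return sum(1 for d in top if d in relevant) / max(len(top), 1)

def mean_p_at_k(run, k=10):
    """Batch-style system score: precision@k averaged over all topics."""
    return sum(precision_at_k(run[t], qrels[t], k) for t in qrels) / len(qrels)

def kendall_tau(xs, ys):
    """Kendall's tau over paired scores (concordant minus discordant pairs)."""
    pairs = list(combinations(range(len(xs)), 2))
    concordant = sum(1 for i, j in pairs if (xs[i] - xs[j]) * (ys[i] - ys[j]) > 0)
    discordant = sum(1 for i, j in pairs if (xs[i] - xs[j]) * (ys[i] - ys[j]) < 0)
    return (concordant - discordant) / max(len(pairs), 1)

batch_scores = [mean_p_at_k(runs[s]) for s in runs]   # batch evaluation results
user_success = [0.7, 0.4]  # hypothetical share of relevant documents users found per system
print("batch vs. user correlation:", kendall_tau(batch_scores, user_success))
```

A tau close to 1 would mean the batch metric ranks systems the same way user success does; the studies mentioned in the abstract question whether this agreement actually holds in practice.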
Pages: 806-809
Page count: 4