LETOR: A benchmark collection for research on learning to rank for information retrieval

Cited by: 251
Authors
Qin, Tao [1]
Liu, Tie-Yan [1]
Xu, Jun [1]
Li, Hang [1]
Affiliations
[1] Microsoft Res Asia, Beijing, Peoples R China
Source
INFORMATION RETRIEVAL | 2010 / Vol. 13 / Issue 04
Keywords
Learning to rank; Information retrieval; Benchmark datasets; Feature extraction
DOI
10.1007/s10791-009-9123-y
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
LETOR is a benchmark collection for research on learning to rank for information retrieval, released by Microsoft Research Asia. In this paper, we describe the details of the LETOR collection and show how it can be used in different kinds of research. Specifically, we describe how the document corpora and query sets in LETOR are selected, how the documents are sampled, how the learning features and meta information are extracted, and how the datasets are partitioned for comprehensive evaluation. We then compare several state-of-the-art learning to rank algorithms on LETOR, report their ranking performance, and discuss the results. After that, we discuss possible new research topics that LETOR can support beyond algorithm comparison. We hope that this paper helps readers gain a deeper understanding of LETOR and enables more interesting research projects on learning to rank and related topics.
Pages: 346 - 374
Page count: 29