LETOR: A benchmark collection for research on learning to rank for information retrieval

Cited by: 251
Authors
Qin, Tao [1]
Liu, Tie-Yan [1]
Xu, Jun [1]
Li, Hang [1]
Affiliations
[1] Microsoft Res Asia, Beijing, Peoples R China
Source
INFORMATION RETRIEVAL | 2010 / Vol. 13 / Issue 04
Keywords
Learning to rank; Information retrieval; Benchmark datasets; Feature extraction;
DOI
10.1007/s10791-009-9123-y
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
LETOR is a benchmark collection for research on learning to rank for information retrieval, released by Microsoft Research Asia. In this paper, we describe the details of the LETOR collection and show how it can be used in different kinds of research. Specifically, we describe how the document corpora and query sets in LETOR are selected, how the documents are sampled, how the learning features and meta information are extracted, and how the datasets are partitioned for comprehensive evaluation. We then compare several state-of-the-art learning to rank algorithms on LETOR, report their ranking performance, and discuss the results. After that, we discuss possible new research topics that LETOR can support in addition to algorithm comparison. We hope that this paper can help people gain a deeper understanding of LETOR and enable more interesting research projects on learning to rank and related topics.
Pages: 346 - 374
Page count: 29
Related papers
50 in total
  • [21] Balancing Speed and Quality in Online Learning to Rank for Information Retrieval
    Oosterhuis, Harrie
    de Rijke, Maarten
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 277 - 286
  • [22] Efficient margin-based rank learning algorithms for information retrieval
    Yan, Rong
    Hauptmann, Alexander G.
    IMAGE AND VIDEO RETRIEVAL, PROCEEDINGS, 2006, 4071 : 113 - 122
  • [23] Improve Biomedical Information Retrieval Using Modified Learning to Rank Methods
    Xu, Bo
    Lin, Hongfei
    Lin, Yuan
    Ma, Yunlong
    Yang, Liang
    Wang, Jian
    Yang, Zhihao
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2018, 15 (06) : 1797 - 1809
  • [24] Learning To Rank Relevant Documents for Information Retrieval in Bioengineering Text Corpora
    Cheng, Kowk Sun
    Song, Myoungkyu
    2021 IEEE 45TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2021), 2021, : 1565 - 1572
  • [25] Learning to Rank for Information Retrieval and Natural Language Processing, Second Edition
    Huawei Technologies, China
    Synth. Lect. Human Lang. Technol., 3 (1-123): : 1 - 123
  • [26] Learning to Rank in Generative Retrieval
    Li, Yongqi
    Yang, Nan
    Wang, Liang
    Wei, Furu
    Li, Wenjie
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 8, 2024, : 8716 - 8723
  • [27] A Bayesian framework for XML information retrieval: Searching and learning with the INEX collection
    Piwowarski, B
    Gallinari, P
    INFORMATION RETRIEVAL, 2005, 8 (04): : 655 - 681
  • [28] A Bayesian Framework for XML Information Retrieval: Searching and Learning with the INEX Collection
    Benjamin Piwowarski
    Patrick Gallinari
    Information Retrieval, 2005, 8 : 655 - 681
  • [29] UNLV-ISRI document collection for research in OCR and information retrieval
    Taghva, K
    Nartker, T
    Borsack, J
    Condit, A
    DOCUMENT RECOGNITION AND RETRIEVAL VII, 2000, 3967 : 157 - 164
  • [30] Rank-order-correlation-based feature vector context transformation for learning to rank for information retrieval
    Yeh, Jen-Yuan
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2018, 33 (01): : 41 - 52