Generalized Ensemble Model for Document Ranking in Information Retrieval

Cited by: 3
Authors
Wang, Yanshan [1]
Choi, In-Chan [2]
Liu, Hongfang [1]
Affiliations
[1] Mayo Clin, Dept Hlth Sci Res, Rochester, MN 55905 USA
[2] Korea Univ, Sch Ind Management Engn, Seoul 136701, South Korea
Keywords
information retrieval; optimization; mean average precision; document ranking; ensemble model
DOI
10.2298/CSIS160229042W
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A generalized ensemble model (gEnM) for document ranking is proposed in this paper. The gEnM linearly combines multiple document retrieval models and aims to rank relevant documents at high positions. To obtain the optimal linear combination of these document retrieval models, or rankers, an optimization program is formulated that directly maximizes the mean average precision. Both supervised and unsupervised learning algorithms are presented to solve this program. For the supervised scheme, two approaches are considered depending on the data setting, namely the batch and online settings. In the batch setting, we propose a revised Newton's algorithm, gEnM.BAT, which approximates the derivative and the Hessian matrix. In the online setting, we advocate a stochastic gradient descent (SGD) based algorithm, gEnM.ON. For the unsupervised scheme, an unsupervised ensemble model (UnsEnM) that iteratively co-learns from each constituent ranker is presented. An experimental study on benchmark data sets verifies the effectiveness of the proposed algorithms. Therefore, with appropriate algorithms, the gEnM is a viable option in diverse practical information retrieval applications.
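The objective described in the abstract, a linear combination of ranker scores evaluated by mean average precision, can be illustrated with a minimal Python sketch. It is not the authors' gEnM.BAT, gEnM.ON, or UnsEnM algorithm: it only evaluates the objective for a given weight vector and does not optimize it. The function names, the data layout (one score list per ranker, relevant documents given as a set of indices), and the toy numbers are all assumptions made for illustration.

```python
from typing import List, Sequence, Set, Tuple


def ensemble_scores(ranker_scores: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Linearly combine per-ranker scores (one row per ranker, one column per document)."""
    n_docs = len(ranker_scores[0])
    return [
        sum(w * scores[d] for w, scores in zip(weights, ranker_scores))
        for d in range(n_docs)
    ]


def average_precision(scores: Sequence[float], relevant: Set[int]) -> float:
    """Average precision of the ranking induced by sorting documents by descending score."""
    if not relevant:
        return 0.0
    order = sorted(range(len(scores)), key=lambda d: -scores[d])
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(order, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant document's rank
    return precision_sum / len(relevant)


Query = Tuple[Sequence[Sequence[float]], Set[int]]


def mean_average_precision(queries: Sequence[Query],
                           weights: Sequence[float]) -> float:
    """Mean of per-query average precision under one shared weight vector."""
    aps = [
        average_precision(ensemble_scores(ranker_scores, weights), relevant)
        for ranker_scores, relevant in queries
    ]
    return sum(aps) / len(aps)


# Hypothetical toy data: two rankers, two queries (not from the paper).
toy_queries: List[Query] = [
    ([[0.9, 0.1, 0.5], [0.2, 0.8, 0.4]], {1}),
    ([[0.3, 0.7], [0.6, 0.4]], {0}),
]
map_first_ranker_only = mean_average_precision(toy_queries, [1.0, 0.0])
map_second_ranker_only = mean_average_precision(toy_queries, [0.0, 1.0])
```

In this toy data the second ranker alone places the relevant document first for both queries, so putting all the weight on it gives a higher mean average precision than putting all the weight on the first ranker. A learning algorithm of the kind the abstract describes would search over the weight vector instead of comparing two fixed choices.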
Pages: 123-151
Page count: 29
Related papers
50 in total
  • [11] Contextualisation of information retrieval process and document ranking task in web search tools
    Bouramoul, Abdelkrim
    [J]. INTERNATIONAL JOURNAL OF SPACE-BASED AND SITUATED COMPUTING, 2016, 6 (02) : 74 - 89
  • [12] Two-level document ranking using mutual information in natural language information retrieval
    Kang, HK
    Choi, KS
    [J]. INFORMATION PROCESSING & MANAGEMENT, 1997, 33 (03) : 289 - 306
  • [13] Generalized scientific and technological information retrieval and extraction based on document evaluation
    Li, K.
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2018, 123 : 50 - 51
  • [14] A new Retrieval ranking method based on Document retrieval expected value in Chinese document
    Wang, Tao
    Chen, Mei
    Jiang, Yan
    Wang, Hanhu
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, 2008, : 367 - 371
  • [15] Research on Personalized Document Retrieval and Ranking Strategy
    Tang, Hai
    Hu, Zhihui
    [J]. PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019, : 1423 - 1426
  • [16] RANKING WITH QUERY INFLUENCE WEIGHTING FOR DOCUMENT RETRIEVAL
    Liao, Zhen
    Huang, Ya Lou
    Xie, Mao Qiang
    Liu, Jie
    Wang, Yang
    Lui, Min
    [J]. PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 1177 - +
  • [17] Semi-supervised ranking for document retrieval
    Duh, Kevin
    Kirchhoff, Katrin
    [J]. COMPUTER SPEECH AND LANGUAGE, 2011, 25 (02) : 261 - 281
  • [18] The Impact of Document Level Ranking on Focused Retrieval
    Kamps, Jaap
    Koolen, Marijn
    [J]. ADVANCES IN FOCUSED RETRIEVAL, 2009, 5631 : 140 - 151
  • [19] Coeus: A System for Oblivious Document Ranking and Retrieval
    Ahmad, Ishtiyaque
    Sarker, Laboni
    Agrawal, Divyakant
    El Abbadi, Amr
    Gupta, Trinabh
    [J]. PROCEEDINGS OF THE 28TH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES, SOSP 2021, 2021, : 672 - 690
  • [20] The Power of Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval
    Li, Minghan
    Popa, Diana Nicoleta
    Chagnon, Johan
    Cinar, Yagmur Gizem
    Gaussier, Eric
    [J]. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (03)