Topic Models Ensembles for AD-HOC Information Retrieval

Cited by: 2
Authors
Ormeno, Pablo [1]
Mendoza, Marcelo [1]
Valle, Carlos [2]
Affiliations
[1] Univ Tecn Federico Santa Maria, Dept Informat, Valparaiso 2340000, Chile
[2] Univ Playa Ancha Ciencias Educ, Dept Informat, Valparaiso 2340000, Chile
Keywords
ad hoc information retrieval; Latent Dirichlet Allocation (LDA); bagging; boosting
DOI
10.3390/info12090360
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Ad hoc information retrieval (ad hoc IR) is a challenging task that consists of ranking text documents for bag-of-words (BOW) queries. Classic approaches based on query and document text vectors use term-weighting functions to rank the documents. One limitation of these methods is their inability to handle polysemous concepts. In addition, they introduce spurious orthogonalities between semantically related words. To address these limitations, model-based IR approaches based on topics have been explored. Specifically, topic models based on Latent Dirichlet Allocation (LDA) build representations of text documents in the latent space of topics, which models polysemy better and avoids generating orthogonal representations for related terms. We extend LDA-based IR strategies using different ensemble strategies. Model selection follows the ensemble learning paradigm, for which we test two successful approaches widely used in supervised learning. We study Boosting and Bagging techniques for topic models, using each model as a weak IR expert. Then, we merge the ranking lists obtained from each model using a simple but effective top-k list fusion approach. We show that our proposal improves precision and recall, outperforming classic IR models and strong baselines based on topic models.
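The abstract does not say which top-k list fusion rule is used, so the following is only a minimal sketch of the general idea, assuming a Borda-style positional count. The function name `fuse_top_k`, the document ids, and the three example rankings are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def fuse_top_k(rankings, k):
    """Fuse ranked lists with a Borda-style count.

    This scoring rule is an assumption made for illustration; it is not
    taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Only the top-k entries of each list contribute to the fusion.
        for position, doc_id in enumerate(ranking[:k]):
            # Position 0 earns k points, position 1 earns k - 1, and so on.
            scores[doc_id] += k - position
    # Sort by descending score; break ties by document id for determinism.
    fused = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return fused[:k]


# Hypothetical top-3 lists, one per topic model in the ensemble.
rankings = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
]
fused = fuse_top_k(rankings, k=3)
print(fused)
```

In this sketch each ensemble member votes with equal weight; a boosting-style variant would presumably weight each list by the quality of the model that produced it.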
Pages: 17