Query expansion based on term time distribution for microblog retrieval

Authors
[1] Han, Zhong-Yuan
[2] Yang, Mu-Yun
[3] Kong, Lei-Lei
[4] Qi, Hao-Liang
[5] Li, Sheng
Source
Yang, Mu-Yun (ymy@mtlab.hit.edu.cn) | 2016 / Science Press / 39
Funding
National Natural Science Foundation of China;
Keywords
Frequency estimation;
DOI
10.11897/SP.J.1016.2016.02031
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In microblog retrieval, content-based query expansion methods are inadequate because the relevant microblog messages are too short to provide reliable term distribution information. Most existing time-based query expansion methods exploit the time profile to shift the prior probability of relevant microblogs. In essence, these methods still cannot avoid the restrictions of short texts, since the relevance between expansion terms and the query is still based on the content of microblogs. To address this problem, this paper proposes a query expansion method based on the time distribution of terms, in which the relevance between query terms and expansion terms is measured by the similarity of their time distributions. First, the changes of term frequency across time segments are analyzed, the term time distribution is defined, and methods for estimating it are described. Then an approach for estimating the similarity of term time distributions is presented; it measures the relevance between query terms and expansion terms and thereby determines the expansion terms in the re-estimated query model. Two query expansion strategies are given for estimating the query expansion model according to the relevance between the expansion terms and the query. Finally, the term time distribution query model is obtained by integrating the query expansion model with the original query model. Because only the time profile is used to establish the relevance between query terms and expansion terms, the method avoids the drawbacks that the length limit of microblogs imposes on classical content-based query expansion approaches. Experiments were carried out on the TREC 2011 and TREC 2012 microblog retrieval collections. Several state-of-the-art baselines are chosen for comparison with the proposed method, including the classical language model, the content-based query expansion method, and the time-based query expansion method. The experimental results show that the term time distribution query model outperforms both the content-based and the time-based approaches. © 2016, Science Press. All rights reserved.
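The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the estimation formulas, the similarity measure, or the two expansion strategies, so everything specific here is an assumption made for illustration: equal-width time segments, normalized per-segment term counts as the term time distribution, cosine similarity between distributions, expansion weights proportional to the average similarity to the query terms, and a linear interpolation weight `lam`. The function names are hypothetical and this is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def term_time_distribution(posts, num_segments, t_min, t_max):
    """Estimate P(segment | term) from (timestamp, tokens) pairs.

    Assumption: equal-width segments and plain relative frequencies.
    """
    width = (t_max - t_min) / num_segments
    counts = defaultdict(lambda: [0.0] * num_segments)
    for timestamp, tokens in posts:
        # Clamp so that timestamp == t_max falls into the last segment.
        segment = min(int((timestamp - t_min) / width), num_segments - 1)
        for token in tokens:
            counts[token][segment] += 1.0
    return {term: [c / sum(cs) for c in cs] for term, cs in counts.items()}


def cosine(p, q):
    """Cosine similarity of two distributions (an assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def expansion_model(query_terms, dists, top_k):
    """Weight candidate terms by average time-distribution similarity."""
    known = [q for q in query_terms if q in dists]
    scores = {}
    for term, dist in dists.items():
        if term in query_terms or not known:
            continue
        scores[term] = sum(cosine(dist, dists[q]) for q in known) / len(known)
    # Sort by descending score, then alphabetically for determinism.
    top = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    total = sum(score for _, score in top)
    return {term: score / total for term, score in top} if total else {}


def interpolate(query_terms, expansion, lam):
    """Mix the original query model with the expansion model."""
    freq = Counter(query_terms)
    model = {t: (1.0 - lam) * n / len(query_terms) for t, n in freq.items()}
    for term, prob in expansion.items():
        model[term] = model.get(term, 0.0) + lam * prob
    return model


# Toy data: timestamps are arbitrary units in [0, 4].
posts = [
    (0.5, ["earthquake", "japan"]),
    (0.6, ["earthquake", "tsunami"]),
    (2.5, ["football", "final"]),
    (3.5, ["football", "goal"]),
]
dists = term_time_distribution(posts, num_segments=4, t_min=0.0, t_max=4.0)
expansion = expansion_model(["earthquake"], dists, top_k=2)
model = interpolate(["earthquake"], expansion, lam=0.4)
print(model)
```

In this toy collection, the terms that burst in the same segment as "earthquake" are selected as expansion terms without any use of co-occurrence statistics, which is the point the abstract makes about avoiding the short-text limitation. A real system would need smoothing for sparse segments and a tuned `lam`.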