Search Result Diversification in Short Text Streams

Cited by: 13
Authors
Liang, Shangsong [1]
Yilmaz, Emine [1, 2]
Shen, Hong [3, 4]
De Rijke, Maarten [5]
Croft, W. Bruce [6]
Affiliations
[1] UCL, Dept Comp Sci, London, England
[2] Alan Turing Inst, London, England
[3] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Guangdong, Peoples R China
[4] Univ Adelaide, Dept Comp Sci, Adelaide, SA, Australia
[5] Univ Amsterdam, Informat Inst, Amsterdam, Netherlands
[6] Univ Massachusetts, Coll Informat & Comp Sci, Amherst, MA 01003 USA
Keywords
Diversity; ad hoc retrieval; data streams
DOI
10.1145/3057282
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
We consider the problem of search result diversification for streams of short texts. Diversifying search results in short text streams is more challenging than in the case of long documents, as it is difficult to capture the latent topics of short documents. To capture the changes of topics and the probabilities of documents for a given query at a specific time in a short text stream, we propose a dynamic Dirichlet multinomial mixture topic model, called D2M3, as well as a Gibbs sampling algorithm for the inference. We also propose a streaming diversification algorithm, SDA, that integrates the information captured by D2M3 with our proposed modified version of the PM-2 (Proportionality-based diversification Method second version) diversification algorithm. We conduct experiments on a Twitter dataset and find that SDA statistically significantly outperforms state-of-the-art non-streaming retrieval methods, plain streaming retrieval methods, as well as streaming diversification methods that use other dynamic topic models.
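To make the diversification step more concrete, the following is a minimal sketch of the standard PM-2 greedy selection, which the abstract names as the starting point for SDA. It is not the authors' modified version and it does not include D2M3. The function name `pm2`, the parameter `lam`, and the inputs `rel` (per-document topic probabilities) and `weights` (topic shares for the query) are illustrative assumptions; in SDA these quantities would presumably come from the topic model at a given time.

```python
from typing import Dict, List


def pm2(rel: Dict[str, List[float]], weights: List[float],
        k: int, lam: float = 0.5) -> List[str]:
    """Standard PM-2 greedy selection (sketch, not the paper's variant).

    rel[d][i] is an assumed estimate of P(d | topic i);
    weights[i] is the assumed share of topic i for the query.
    """
    seats = [0.0] * len(weights)   # portion of the ranking held by each topic
    remaining = dict(rel)          # candidates not yet ranked
    ranking: List[str] = []

    while remaining and len(ranking) < k:
        # Sainte-Lague quotient: topics that already hold seats are discounted.
        quotient = [w / (2.0 * s + 1.0) for w, s in zip(weights, seats)]
        best = max(range(len(weights)), key=lambda i: quotient[i])

        def score(d: str) -> float:
            p = remaining[d]
            own = quotient[best] * p[best]
            others = sum(quotient[i] * p[i]
                         for i in range(len(p)) if i != best)
            return lam * own + (1.0 - lam) * others

        doc = max(remaining, key=score)
        p = remaining.pop(doc)

        # Split one seat among topics in proportion to the document's relevance.
        total = sum(p)
        if total > 0:
            seats = [s + pi / total for s, pi in zip(seats, p)]
        ranking.append(doc)

    return ranking


# Hypothetical toy input: two documents on topic 0, one on topic 1.
docs = {"a": [0.9, 0.0], "b": [0.8, 0.0], "c": [0.0, 0.5]}
print(pm2(docs, weights=[0.5, 0.5], k=3))
```

With equal topic weights, the sketch places a topic-1 document directly after the first topic-0 document, even though the second topic-0 document has a higher raw relevance score; this is the proportionality behaviour PM-2 is designed for.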
Pages: 35