Diversity-Aware Top-k Publish/Subscribe for Text Stream

Cited by: 35
Authors
Chen, Lisi [1]
Cong, Gao [1]
Affiliation
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
Keywords
text stream; diversification; publish/subscribe; search
DOI
10.1145/2723372.2749451
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Massive amounts of text data are being generated by a huge number of web users at an unprecedented scale, and these data cover a wide range of topics. Users are interested in receiving a few up-to-date, representative documents (e.g., tweets) that provide wide coverage of different aspects of their query topics. To address this problem, we consider the Diversity-Aware Top-k Subscription (DAS) query. Given a DAS query, we continuously maintain an up-to-date result set that contains the k most recently returned documents over a text stream for the query. The DAS query takes into account text relevance, document recency, and result diversity. We propose a novel solution for efficiently processing a large number of DAS queries over a stream of documents. We demonstrate the efficiency of our approach on a real-world dataset; the experimental results show that our solution reduces processing time by 60-75% compared with two baselines. We also study the effectiveness of the DAS query.
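The abstract names three ingredients (text relevance, document recency, result diversity) but does not give the scoring functions or the index used in the paper. The following is only a minimal sketch of how a single subscription might maintain a size-k result set over a stream under assumed definitions: the `relevance`, `dissimilarity`, and `set_score` functions, the `alpha` and `decay` parameters, and the replace-the-oldest policy are all hypothetical illustrations, not the authors' method.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Doc:
    doc_id: int
    terms: frozenset
    timestamp: float


def relevance(query_terms, doc):
    """Assumed relevance: fraction of query terms present in the document."""
    return len(query_terms & doc.terms) / len(query_terms)


def dissimilarity(a, b):
    """Assumed diversity measure: Jaccard distance between term sets."""
    union = a.terms | b.terms
    if not union:
        return 0.0
    return 1.0 - len(a.terms & b.terms) / len(union)


def set_score(query_terms, docs, now, alpha=0.5, decay=0.9):
    """Assumed objective: blend recency-weighted relevance with
    average pairwise dissimilarity of the result set."""
    if not docs:
        return 0.0
    rel = sum(
        relevance(query_terms, d) * decay ** (now - d.timestamp) for d in docs
    ) / len(docs)
    pairs = list(combinations(docs, 2))
    div = sum(dissimilarity(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return alpha * rel + (1 - alpha) * div


@dataclass
class Subscription:
    query_terms: frozenset
    k: int
    results: list = field(default_factory=list)

    def offer(self, doc):
        """Process one arriving document; return True if the result set changed."""
        if relevance(self.query_terms, doc) == 0:
            return False  # irrelevant documents never enter the result set
        if len(self.results) < self.k:
            self.results.append(doc)
            return True
        # Hypothetical policy: consider swapping out the oldest result only.
        oldest = min(self.results, key=lambda d: d.timestamp)
        candidate = [d for d in self.results if d is not oldest] + [doc]
        now = doc.timestamp
        if set_score(self.query_terms, candidate, now) > set_score(
            self.query_terms, self.results, now
        ):
            self.results = candidate
            return True
        return False
```

In this sketch, a new document displaces the oldest result only when the swap raises the combined score, so a near-duplicate of existing results is less likely to be admitted than a document covering a new aspect of the query. A system serving many subscriptions would need shared indexing to avoid evaluating every query per document, which this sketch does not attempt.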
Pages: 347-362
Number of pages: 16
Related papers
50 in total
  • [41] Indoor Top-k Keyword-aware Routing Query
    Feng, Zijin
    Liu, Tiantian
    Li, Huan
    Lu, Hua
    Shou, Lidan
    Xu, Jianliang
    [J]. 2020 IEEE 36TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2020), 2020, : 1213 - 1224
  • [42] Social-Aware Spatial Top-k and Skyline Queries
    Sohail, Ammar
    Cheema, Muhammad Aamir
    Taniar, David
    [J]. COMPUTER JOURNAL, 2018, 61 (11): : 1620 - 1638
  • [43] Distributed top-k full-text content dissemination
    Rao, Weixiong
    Chen, Lei
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 2012, 30 (3-4) : 273 - 301
  • [44] Distributed top-k full-text content dissemination
    Weixiong Rao
    Lei Chen
    [J]. Distributed and Parallel Databases, 2012, 30 : 273 - 301
  • [45] Evaluating Top-K Approximate Patterns via Text Clustering
    Lucchese, Claudio
    Orlando, Salvatore
    Perego, Raffaele
    [J]. BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2016, 2016, 9829 : 114 - 127
  • [46] Top-k Ranked Document Search in General Text Databases
    Culpepper, J. Shane
    Navarro, Gonzalo
    Puglisi, Simon J.
    Turpin, Andrew
    [J]. ALGORITHMS-ESA 2010, PT II, 2010, 6347 : 194 - +
  • [47] From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology
    Dingemanse, Mark
    Liesenfeld, Andreas
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 5614 - 5633
  • [48] Finding Top-k Shortest Paths with Diversity (Extended Abstract)
    Liu, Huiping
    Jin, Cheqing
    Yang, Bin
    Zhou, Aoying
    [J]. 2018 IEEE 34TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2018, : 1761 - 1762
  • [49] Social-aware spatial keyword top-k group query
    Xiangguo Zhao
    Zhen Zhang
    Hong Huang
    Xin Bi
    [J]. Distributed and Parallel Databases, 2020, 38 : 601 - 623
  • [50] Data-Aware Top-k Monitoring in Wireless Sensor Networks
    Yeo, Myungho
    Seong, Dongook
    Yoo, Jaesoo
    [J]. RWS: 2009 IEEE RADIO AND WIRELESS SYMPOSIUM, 2009, : 95 - 98