Diversity-Aware Top-k Publish/Subscribe for Text Stream

Cited by: 35
Authors
Chen, Lisi [1]
Cong, Gao [1]
Affiliations
[1] Nanyang Technological University, School of Computer Engineering, Singapore, Singapore
Keywords
text stream; diversification; publish/subscribe; SEARCH;
DOI
10.1145/2723372.2749451
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Massive amounts of text data are being generated by a huge number of web users at an unprecedented scale. These data cover a wide range of topics. Users are interested in receiving a few up-to-date, representative documents (e.g., tweets) that give them wide coverage of the different aspects of their query topics. To address this problem, we consider the Diversity-Aware Top-k Subscription (DAS) query. Given a DAS query, we continuously maintain an up-to-date result set that contains the k most recently returned documents over a text stream for the query. The DAS query takes into account text relevance, document recency, and result diversity. We propose a novel solution for efficiently processing a large number of DAS queries over a stream of documents. We demonstrate the efficiency of our approach on a real-world dataset, and the experimental results show that our solution reduces processing time by 60-75% compared with two baselines. We also study the effectiveness of the DAS query.
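To make the three ingredients named in the abstract concrete, here is a minimal illustrative sketch of a single subscription that keeps at most k documents and balances relevance, recency, and diversity. It is not the algorithm proposed in the paper: the scoring formula, the `alpha` and `decay` parameters, the Jaccard-based similarity, and the class and method names (`Doc`, `DASQuery`, `publish`) are all assumptions chosen for illustration, and the brute-force swap check ignores the efficiency techniques the paper is about.

```python
import math
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Doc:
    doc_id: str
    terms: frozenset
    timestamp: float


def jaccard(a, b):
    """Jaccard similarity between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class DASQuery:
    """Hypothetical subscription holding at most k result documents."""
    terms: frozenset
    k: int
    alpha: float = 0.5   # assumed weight: relevance/recency vs. diversity
    decay: float = 0.01  # assumed exponential decay rate per time unit
    results: list = field(default_factory=list)

    def relevance(self, doc):
        # Fraction of query terms that appear in the document.
        if not self.terms:
            return 0.0
        return len(self.terms & doc.terms) / len(self.terms)

    def score(self, docs, now):
        # Mean recency-discounted relevance plus mean pairwise dissimilarity.
        if not docs:
            return 0.0
        rel = sum(
            self.relevance(d) * math.exp(-self.decay * (now - d.timestamp))
            for d in docs
        ) / len(docs)
        pairs = list(combinations(docs, 2))
        div = (
            sum(1.0 - jaccard(a.terms, b.terms) for a, b in pairs) / len(pairs)
            if pairs
            else 1.0
        )
        return self.alpha * rel + (1.0 - self.alpha) * div

    def publish(self, doc, now):
        """Offer a new document; return True if the result set changed."""
        if self.relevance(doc) == 0.0:
            return False
        if len(self.results) < self.k:
            self.results.append(doc)
            return True
        # Try replacing each current result with the new document and keep
        # the swap that improves the set score the most, if any.
        best, best_score = self.results, self.score(self.results, now)
        for i in range(len(self.results)):
            candidate = self.results[:i] + self.results[i + 1:] + [doc]
            s = self.score(candidate, now)
            if s > best_score:
                best, best_score = candidate, s
        changed = best is not self.results
        self.results = best
        return changed
```

In this sketch, a new document that matches the query but covers a different aspect than the current results tends to displace an older near-duplicate, while documents sharing no terms with the query are ignored.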
Pages: 347-362
Number of pages: 16