Efficient Evaluation of Continuous Text Search Queries

Cited by: 11
Authors
Mouratidis, Kyriakos [1]
Pang, HweeHwa [1]
Affiliations
[1] Singapore Management Univ, Sch Informat Syst, Singapore 178902, Singapore
Keywords
Continuous queries; document streams; text filtering; k selection queries; maintenance; strategies
DOI
10.1109/TKDE.2011.125
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Consider a text filtering server that monitors a stream of incoming documents for a set of users, who register their interests in the form of continuous text search queries. The task of the server is to constantly maintain for each query a ranked result list, comprising the recent documents (drawn from a sliding window) with the highest similarity to the query. Such a system underlies many text monitoring applications that need to cope with heavy document traffic, such as news and email monitoring. In this paper, we propose the first solution for processing continuous text queries efficiently. Our objective is to support a large number of user queries while sustaining high document arrival rates. Our solution indexes the streamed documents in main memory with a structure based on the principles of the inverted file, and processes document arrival and expiration events with an incremental threshold-based method. We distinguish between two versions of the monitoring algorithm, an eager and a lazy one, which differ in how aggressively they manage the thresholds on the inverted index. Using benchmark queries over a stream of real documents, we experimentally verify the efficiency of our methodology; both its versions are at least an order of magnitude faster than a competitor constructed from existing techniques, with lazy being the best approach overall.
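To make the problem setting in the abstract concrete, here is a minimal Python sketch of a naive baseline: an in-memory inverted index over a sliding window of documents, with each query's top-k list recomputed from the index on demand. It is not the paper's eager or lazy threshold-based algorithm, whose details the abstract does not give. The class name `SlidingWindowIndex`, the count-based window, and the dot-product similarity over term weights are all assumptions made for illustration.

```python
from collections import defaultdict, deque
import heapq


class SlidingWindowIndex:
    """Hypothetical baseline: count-based sliding window plus in-memory inverted index."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()              # (doc_id, term_weights) in arrival order
        self.postings = defaultdict(dict)  # term -> {doc_id: weight}

    def add(self, doc_id, term_weights):
        """Index an arriving document; return the id of the expired document, if any."""
        self.window.append((doc_id, term_weights))
        for term, weight in term_weights.items():
            self.postings[term][doc_id] = weight
        if len(self.window) > self.window_size:
            return self._expire()
        return None

    def _expire(self):
        """Drop the oldest document from the window and from every posting list."""
        old_id, old_terms = self.window.popleft()
        for term in old_terms:
            del self.postings[term][old_id]
            if not self.postings[term]:
                del self.postings[term]
        return old_id

    def top_k(self, query, k):
        """Recompute the k highest-scoring valid documents for one query.

        Assumed similarity: dot product of query and document term weights.
        """
        scores = defaultdict(float)
        for term, query_weight in query.items():
            for doc_id, doc_weight in self.postings.get(term, {}).items():
                scores[doc_id] += query_weight * doc_weight
        # Ties are broken by the larger (more recent) document id.
        return heapq.nlargest(k, scores.items(), key=lambda item: (item[1], item[0]))


index = SlidingWindowIndex(window_size=2)
index.add(1, {"stock": 0.5, "market": 0.5})
index.add(2, {"stock": 1.0})
expired = index.add(3, {"market": 1.0})  # document 1 leaves the window
result = index.top_k({"stock": 1.0}, k=2)
```

Recomputing every query's result from scratch on each arrival or expiration is the cost that an incremental method of the kind the abstract describes aims to avoid; this sketch only fixes the data structures and the meaning of a "ranked result list over the window".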
Pages: 1469-1482
Page count: 14