Similarity Join Processing on Uncertain Data Streams

Cited by: 14
Authors
Lian, Xiang [1]
Chen, Lei [1]
Affiliation
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Join on uncertain data streams; adaptive superset prejoin;
DOI
10.1109/TKDE.2010.208
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Similarity join processing in the streaming environment has many practical applications, such as sensor networks and object tracking and monitoring. Previous work usually assumes that stream processing is conducted over precise data. In this paper, we study the important problem of similarity join processing on stream data that inherently contain uncertainty (called uncertain data streams), where the incoming data at each time stamp are uncertain and imprecise. Specifically, we formalize this problem as join on uncertain data streams (USJ), which can guarantee the accuracy of USJ answers over uncertain data. To tackle the efficiency and effectiveness challenges, such as limited memory and the need for short response times, we propose effective pruning methods at both the object and sample levels to filter out false alarms. We integrate the proposed pruning methods into an efficient query procedure that incrementally maintains the USJ answers. Most importantly, we further design a novel strategy, adaptive superset prejoin (ASP), to maintain a superset of USJ candidate pairs. ASP is guided by our proposed formal cost model so that the average USJ processing cost is minimized. We have conducted extensive experiments to demonstrate the efficiency and effectiveness of our proposed approaches.
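As a rough illustration of the idea in the abstract, the sketch below joins two windows of uncertain objects and applies a simple object-level pruning step before computing join probabilities. It is not the paper's algorithm. Everything here is an assumption made for illustration: each uncertain object is modeled as a list of equally weighted sample points, `eps` is a distance threshold, `alpha` is a probability threshold, and the function names are hypothetical. Sample-level pruning, incremental maintenance over sliding windows, and ASP are not shown.

```python
import math
from typing import List, Tuple

Point = Tuple[float, ...]
UncertainObject = List[Point]  # assumption: equally weighted samples


def bounding_box(samples: UncertainObject) -> Tuple[Point, Point]:
    """Axis-aligned box enclosing all samples of one uncertain object."""
    dims = range(len(samples[0]))
    lo = tuple(min(s[d] for s in samples) for d in dims)
    hi = tuple(max(s[d] for s in samples) for d in dims)
    return lo, hi


def min_box_distance(box_a, box_b) -> float:
    """Smallest possible Euclidean distance between points of two boxes."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    gaps = [max(lo_b[d] - hi_a[d], lo_a[d] - hi_b[d], 0.0)
            for d in range(len(lo_a))]
    return math.sqrt(sum(g * g for g in gaps))


def join_probability(a: UncertainObject, b: UncertainObject, eps: float) -> float:
    """Fraction of sample pairs (one from a, one from b) within distance eps."""
    hits = sum(1 for p in a for q in b if math.dist(p, q) <= eps)
    return hits / (len(a) * len(b))


def similarity_join(window_a: List[UncertainObject],
                    window_b: List[UncertainObject],
                    eps: float, alpha: float) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) whose join probability is at least alpha."""
    boxes_b = [bounding_box(b) for b in window_b]
    results = []
    for i, a in enumerate(window_a):
        box_a = bounding_box(a)
        for j, b in enumerate(window_b):
            # Object-level pruning: if even the closest points of the two
            # boxes are farther apart than eps, no sample pair can match.
            if min_box_distance(box_a, boxes_b[j]) > eps:
                continue
            if join_probability(a, b, eps) >= alpha:
                results.append((i, j))
    return results


window_a = [[(0.0, 0.0), (1.0, 0.0)]]
window_b = [[(0.5, 0.0), (1.5, 0.0)], [(10.0, 10.0), (11.0, 10.0)]]
print(similarity_join(window_a, window_b, eps=1.0, alpha=0.5))
```

In this toy input, the second object in `window_b` is discarded by the bounding-box check without any sample-level distance computation, which is the kind of saving that pruning is meant to provide.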
Pages: 1718-1734
Page count: 17
Related papers
50 in total
  • [41] Continuous Outlier Monitoring on Uncertain Data Streams
    曹科研
    王国仁
    韩东红
    丁国辉
    王爱侠
    石凌旭
    [J]. Journal of Computer Science & Technology, 2014, 29 (03) : 436 - 448
  • [42] Continuous Outlier Monitoring on Uncertain Data Streams
    Cao, Ke-Yan
    Wang, Guo-Ren
    Han, Dong-Hong
    Ding, Guo-Hui
    Wang, Ai-Xia
    Shi, Ling-Xu
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2014, 29 (03) : 436 - 448
  • [43] Sliding windows over uncertain data streams
    Michele Dallachiesa
    Gabriela Jacques-Silva
    Buğra Gedik
    Kun-Lung Wu
    Themis Palpanas
    [J]. Knowledge and Information Systems, 2015, 45 : 159 - 190
  • [44] Sliding windows over uncertain data streams
    Dallachiesa, Michele
    Jacques-Silva, Gabriela
    Gedik, Bugra
    Wu, Kun-Lung
    Palpanas, Themis
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 45 (01) : 159 - 190
  • [45] Twig'n join: Progressive query processing of multiple XML streams
    Tok, Wee Hyong
    Bressan, Stephane
    Lee, Mong-Li
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2008, 4947 : 546 - 553
  • [46] Continuous Outlier Monitoring on Uncertain Data Streams
    Ke-Yan Cao
    Guo-Ren Wang
    Dong-Hong Han
    Guo-Hui Ding
    Ai-Xia Wang
    Ling-Xu Shi
    [J]. Journal of Computer Science and Technology, 2014, 29 : 436 - 448
  • [47] SJCBMQ: A novel spatial join-based algorithm for continuous border monitoring query processing in data streams
    Zhang, Yunyi
    Huang, Chongzheng
    Zhang, Deyun
    [J]. 2007 2ND INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND APPLICATIONS, VOLS 1 AND 2, 2007, : 291 - +
  • [48] Discovery of Cross-Similarity in Data Streams
    Toyoda, Machiko
    Sakurai, Yasushi
    [J]. 26TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING ICDE 2010, 2010, : 101 - 104
  • [49] PI-Join: Efficiently processing join queries on massive data
    Xixian Han
    Jianzhong Li
    Donghua Yang
    [J]. Knowledge and Information Systems, 2012, 32 : 527 - 557
  • [50] Time-slide window join over data streams
    Hyeon Gyu Kim
    Yoo Hyun Park
    Yang Hyun Cho
    Myoung Ho Kim
    [J]. Journal of Intelligent Information Systems, 2014, 43 : 323 - 347