Join Queries on Uncertain Data: Semantics and Efficient Processing

Cited by: 0
Author
Ge, Tingjian [1]
Affiliation
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Uncertain data is quite common nowadays in a variety of modern database applications. At the same time, the join operation is one of the most important but expensive operations in SQL. However, join queries on uncertain data have not been adequately addressed thus far. In this paper, we study the SQL join operation on uncertain attributes. We observe and formalize two kinds of join operations on such data, namely v-join and d-join. They are each useful for different applications. Using probability theory, we then devise efficient query processing algorithms for these join operations. Specifically, we use probability bounds that are based on the moments of random variables to either early accept or early reject a candidate v-join result tuple. We also devise an indexing mechanism and an algorithm called Two-End Zigzag Join to further save I/O costs. For d-join, we first observe that it can be reduced to a special form of similarity join in a multidimensional space. We then design an efficient algorithm called condensed d-join and an optimal condensation scheme based on dynamic programming. Finally, we perform a comprehensive empirical study using both real datasets and synthetic datasets.
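The abstract does not give the exact v-join predicate or the specific bounds used, so the following is only a minimal sketch of the general idea of moment-based early accept/reject, not the paper's algorithm. It assumes (hypothetically) that a candidate pair joins when P(|X - Y| <= eps) >= tau, that X and Y are independent, and that only their means and variances are stored. The names `Moments`, `prune_by_moments`, `eps`, and `tau` are illustrative. The upper bound uses Cantelli's inequality and the lower bound uses Chebyshev's inequality; the paper may use different or tighter bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moments:
    """First two moments of an uncertain attribute (hypothetical summary record)."""
    mean: float
    var: float


def prune_by_moments(x: Moments, y: Moments, eps: float, tau: float) -> str:
    """Classify a candidate pair under the ASSUMED predicate P(|X - Y| <= eps) >= tau.

    Returns "accept", "reject", or "undecided". Only "undecided" pairs would
    need the exact (expensive) probability computation.
    """
    mu = x.mean - y.mean   # mean of D = X - Y
    var = x.var + y.var    # variance of D, assuming X and Y are independent
    gap = abs(mu) - eps

    if gap > 0:
        # |D| <= eps forces D to move at least `gap` from its mean toward zero.
        # Cantelli: P(D - mu <= -a) <= var / (var + a^2) for a > 0.
        upper = var / (var + gap * gap)
        if upper < tau:
            return "reject"
    elif gap < 0:
        # |D| > eps forces |D - mu| > eps - |mu|.
        # Chebyshev: P(|D - mu| >= a) <= var / a^2 for a > 0.
        lower = 1.0 - var / (gap * gap)
        if lower >= tau:
            return "accept"
    return "undecided"
```

For example, two values with means 10 and 0 and unit variances cannot plausibly fall within `eps = 1` of each other, so the pair is rejected without computing the exact probability, while pairs whose bounds straddle `tau` fall through to exact evaluation.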
Pages: 697-708
Page count: 12