Join Queries on Uncertain Data: Semantics and Efficient Processing

Author
Ge, Tingjian [1]
Affiliation
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Uncertain data is quite common nowadays in a variety of modern database applications. At the same time, the join operation is one of the most important but expensive operations in SQL. However, join queries on uncertain data have not been adequately addressed thus far. In this paper, we study the SQL join operation on uncertain attributes. We observe and formalize two kinds of join operations on such data, namely v-join and d-join. They are each useful for different applications. Using probability theory, we then devise efficient query processing algorithms for these join operations. Specifically, we use probability bounds that are based on the moments of random variables to either early accept or early reject a candidate v-join result tuple. We also devise an indexing mechanism and an algorithm called Two-End Zigzag Join to further save I/O costs. For d-join, we first observe that it can be reduced to a special form of similarity join in a multidimensional space. We then design an efficient algorithm called condensed d-join and an optimal condensation scheme based on dynamic programming. Finally, we perform a comprehensive empirical study using both real datasets and synthetic datasets.
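The early accept / early reject idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the join predicate used here (two independent uncertain values X and Y match when P(|X - Y| <= eps) >= tau), the names `Moments`, `probability_bounds`, and `classify`, and the particular bounds (Markov's inequality on the second moment for the lower bound, Cantelli's one-sided inequality for the upper bound) are all assumptions chosen for illustration. Only the mean and variance of each attribute are needed, so a candidate pair can often be decided without evaluating the full distributions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moments:
    """First two moments of an uncertain attribute (hypothetical helper type)."""
    mean: float
    var: float


def probability_bounds(x: Moments, y: Moments, eps: float) -> tuple[float, float]:
    """Return (lower, upper) bounds on P(|X - Y| <= eps).

    Assumes X and Y are independent and eps > 0.
    """
    mu = x.mean - y.mean   # E[D] for D = X - Y
    var = x.var + y.var    # Var[D] under independence

    # Markov on D^2: P(|D| > eps) <= E[D^2] / eps^2, with E[D^2] = var + mu^2.
    lower = max(0.0, 1.0 - (var + mu * mu) / (eps * eps))

    # Cantelli: if |mu| > eps, then |D| <= eps forces D to deviate from its
    # mean by at least gap = |mu| - eps on one side.
    gap = abs(mu) - eps
    upper = var / (var + gap * gap) if gap > 0 else 1.0

    return lower, upper


def classify(x: Moments, y: Moments, eps: float, tau: float) -> str:
    """Decide a candidate pair from moments alone when the bounds allow it."""
    lower, upper = probability_bounds(x, y, eps)
    if lower >= tau:
        return "accept"     # probability is certainly at least tau
    if upper < tau:
        return "reject"     # probability is certainly below tau
    return "undecided"      # the exact probability must be computed
```

Pairs classified as "undecided" would still need an exact probability computation; the bounds only prune the cases that are clearly in or clearly out.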
Pages: 697-708
Number of pages: 12
Related papers
50 in total
  • [31] An Efficient Processing of Join Queries for Sensor Networks Using Column-Oriented Databases
    Kim, Kyung-Chang
    [J]. INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2013,
  • [32] Efficient Parallel Processing of Analytical Queries on Linked Data
    Hagedorn, Stefan
    Sattler, Kai-Uwe
    [J]. ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2013 CONFERENCES, 2013, 8185 : 452 - 469
  • [33] Efficient Processing of Queries over Recursive XML Data
    Alghamdi, Norah Saleh
    Rahayu, Wenny
    Pardede, Eric
    [J]. 2015 IEEE 29th International Conference on Advanced Information Networking and Applications (IEEE AINA 2015), 2015, : 134 - 142
  • [34] Efficient integrity checks for join queries in the cloud
    di Vimercati, Sabrina De Capitani
    Foresti, Sara
    Jajodia, Sushil
    Paraboschi, Stefano
    Samarati, Pierangela
    [J]. JOURNAL OF COMPUTER SECURITY, 2016, 24 (03) : 347 - 378
  • [35] Efficient Searchable Symmetric Encryption for Join Queries
    Jutla, Charanjit
    Patranabis, Sikhar
    [J]. ADVANCES IN CRYPTOLOGY-ASIACRYPT 2022, PT III, 2022, 13793 : 304 - 333
  • [36] Efficient Range Query Processing on Uncertain Data
    Knight, Andrew
    Yu, Qi
    Rege, Manjeet
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI), 2011, : 263 - 268
  • [37] An efficient algorithm for top-k queries on uncertain data streams
    Dai, Caiyan
    Chen, Ling
    Chen, Yixin
    Tang, Keming
    [J]. 2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 1, 2012, : 294 - 299
  • [38] An efficient scheme for probabilistic skyline queries over distributed uncertain data
    Xiaoyong Li
    Yijie Wang
    Jie Yu
    [J]. Telecommunication Systems, 2015, 60 : 225 - 237
  • [39] An efficient scheme for probabilistic skyline queries over distributed uncertain data
    Li, Xiaoyong
    Wang, Yijie
    Yu, Jie
    [J]. TELECOMMUNICATION SYSTEMS, 2015, 60 (02) : 225 - 237
  • [40] Efficient and Progressive Algorithms for Distributed Skyline Queries over Uncertain Data
    Ding, Xiaofeng
    Jin, Hai
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (08) : 1448 - 1462