Join Queries on Uncertain Data: Semantics and Efficient Processing

Author
Ge, Tingjian [1]
Affiliation
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Uncertain data is quite common nowadays in a variety of modern database applications. At the same time, the join operation is one of the most important but expensive operations in SQL. However, join queries on uncertain data have not been adequately addressed thus far. In this paper, we study the SQL join operation on uncertain attributes. We observe and formalize two kinds of join operations on such data, namely v-join and d-join. They are each useful for different applications. Using probability theory, we then devise efficient query processing algorithms for these join operations. Specifically, we use probability bounds that are based on the moments of random variables to either early accept or early reject a candidate v-join result tuple. We also devise an indexing mechanism and an algorithm called Two-End Zigzag Join to further save I/O costs. For d-join, we first observe that it can be reduced to a special form of similarity join in a multidimensional space. We then design an efficient algorithm called condensed d-join and an optimal condensation scheme based on dynamic programming. Finally, we perform a comprehensive empirical study using both real datasets and synthetic datasets.
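The early accept / early reject idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual bounds or the formal v-join predicate, so everything below is an assumption made for illustration: the predicate is taken to be Pr[|X - Y| <= eps] >= tau, the two attributes are assumed independent, and the classical Cantelli and Chebyshev inequalities stand in for the moment-based bounds. The names `Moments` and `classify_pair` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moments:
    """First two moments of an uncertain attribute (illustrative type)."""
    mean: float
    var: float


def classify_pair(x: Moments, y: Moments, eps: float, tau: float) -> str:
    """Decide a candidate pair from moments alone, when possible.

    Assumed predicate (not taken from the paper): the pair joins if
    Pr[|X - Y| <= eps] >= tau, with X and Y independent.
    Returns "accept", "reject", or "undecided".
    """
    # Moments of the difference D = X - Y under the independence assumption.
    mu = x.mean - y.mean
    var = x.var + y.var
    gap = abs(mu)

    if gap > eps:
        # |D| <= eps forces D to deviate from its mean by at least a on one
        # side, so Cantelli's inequality gives an upper bound on the probability.
        a = gap - eps
        upper = var / (var + a * a)
        if upper < tau:
            return "reject"
    elif gap < eps:
        # |D - mu| < a implies |D| < eps, so Chebyshev's inequality gives a
        # lower bound on the probability.
        a = eps - gap
        lower = 1.0 - var / (a * a)
        if lower >= tau:
            return "accept"

    # The bounds are inconclusive; the exact probability would be needed.
    return "undecided"


if __name__ == "__main__":
    print(classify_pair(Moments(0.0, 1.0), Moments(10.0, 1.0), eps=1.0, tau=0.5))
    print(classify_pair(Moments(5.0, 0.01), Moments(5.1, 0.01), eps=1.0, tau=0.9))
    print(classify_pair(Moments(0.0, 1.0), Moments(0.5, 1.0), eps=1.0, tau=0.9))
```

Only pairs that come back "undecided" would need the full distributions, which is where the savings from such bounds would come from under these assumptions.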
Pages: 697-708
Number of pages: 12