Size Bounds and Query Plans for Relational Joins

Cited by: 71
Authors
Atserias, Albert [1 ]
Grohe, Martin [2 ]
Marx, Daniel [3 ]
Affiliations
[1] Univ Politecn Cataluna, Barcelona, Spain
[2] Humboldt Univ, Berlin, Germany
[3] Budapest Univ Technol & Econ, Budapest, Hungary
DOI
10.1109/FOCS.2008.43
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Relational joins are at the core of relational algebra, which in turn is the core of the standard database query language SQL. As their evaluation is expensive and very often dominated by the output size, it is an important task for database query optimisers to compute estimates on the size of joins and to find good execution plans for sequences of joins. We study these problems from a theoretical perspective, both in the worst-case model, and in an average-case model where the database is chosen according to a known probability distribution. In the former case, our first key observation is that the worst-case size of a query is characterised by the fractional edge cover number of its underlying hypergraph, a combinatorial parameter previously known to provide an upper bound. We complete the picture by proving a matching lower bound, and by showing that there exist queries for which the join-project plan suggested by the fractional edge cover approach may be substantially better than any join plan that does not use intermediate projections. On the other hand, we show that in the average-case model, every join-project plan can be turned into a plan containing no projections in such a way that the expected time to evaluate the plan increases only by a constant factor independent of the size of the database. Not surprisingly, the key combinatorial parameter in this context is the maximum density of the underlying hypergraph. We show how to make effective use of this parameter to eliminate the projections.
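
As an illustration of the worst-case bound described in the abstract, the following minimal sketch checks a fractional edge cover for the triangle query R(a,b), S(b,c), T(a,c) and compares the resulting size bound with the output of a naive join. The triangle query, the weights of 1/2 per relation, the test instance, and all function names are assumptions chosen for this example; none of them are taken from the paper itself.

```python
from fractions import Fraction
from itertools import product
from math import prod


def is_fractional_edge_cover(attributes, edges, weights):
    """Return True if every attribute receives total weight >= 1."""
    return all(
        sum(w for e, w in zip(edges, weights) if a in e) >= 1
        for a in attributes
    )


def size_bound(sizes, weights):
    """Upper bound on the join size: product of |R_i| ** x_i."""
    return prod(n ** float(w) for n, w in zip(sizes, weights))


def triangle_join(R, S, T):
    """Naive evaluation of R(a,b) JOIN S(b,c) JOIN T(a,c)."""
    return {
        (a, b, c)
        for (a, b) in R
        for (b2, c) in S
        if b2 == b and (a, c) in T
    }


# Hypergraph of the triangle query: one hyperedge per relation.
attributes = ["a", "b", "c"]
edges = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
weights = [Fraction(1, 2)] * 3  # total weight 3/2

# Illustrative instance: each relation is the full k x k grid, so N = k * k.
k = 4
R = S = T = set(product(range(k), repeat=2))

if __name__ == "__main__":
    print(is_fractional_edge_cover(attributes, edges, weights))
    print(size_bound([len(R), len(S), len(T)], weights))
    print(len(triangle_join(R, S, T)))
```

On this instance each relation has N = k² tuples and the join returns k³ = N^{3/2} tuples, so the bound given by the cover of total weight 3/2 is met exactly. Finding an optimal cover for an arbitrary query requires solving a linear program, which this sketch does not attempt.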
Pages: 739 / +
Number of pages: 2