Robust Join Processing with Diamond Hardened Joins

Cited by: 0

Authors:

Birler, Altan [1]

Kemper, Alfons [1]

Neumann, Thomas [1]

Affiliations:

[1] Tech Univ Munich, Munich, Germany

Source:

PROCEEDINGS OF THE VLDB ENDOWMENT | 2024 / Vol. 17 / Issue 11

Keywords:

QUERY PLANS; LOOKING;

DOI:

10.14778/3681954.3681995

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Join ordering and join processing have a huge impact on query execution and can easily affect the query response time by orders of magnitude. In particular, when joins are potentially growing n:m joins, execution can be very expensive. This can be seen by examining the sizes of intermediate results: if a join query produces many redundant tuples that are later eliminated, the query is likely to be expensive, and that cost is not justified by the size of the query result. This gives the query a diamond shape, with intermediate results larger than both the inputs and the output. This occurs frequently in various workloads, particularly in graph workloads, and also in benchmarks like JOB. We call this issue the diamond problem, and to address it, we propose the diamond hardened join framework, which splits join operators into two suboperators: Lookup & Expand. By allowing these suboperators to be freely reordered by the query optimizer, we improve the runtime of queries that exhibit the diamond problem without sacrificing performance for the rest of the queries. Past theoretical work such as worst-case optimal joins similarly tries to avoid huge intermediate results. However, these approaches have significant overheads that impact all queries. We demonstrate that our approach leads to excellent performance both in queries that exhibit the diamond problem and in regular queries that can be handled by traditional binary joins. This allows for a unified approach, offering excellent performance across the board. Compared to traditional joins, query performance is improved by up to 500x in the CE benchmark and remains excellent in TPC-H and JOB.
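The split into Lookup and Expand described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the function names (`build`, `lookup`, `expand`), the dictionary-based tuples, and the toy relations R, S, and T are all assumptions made for illustration. The idea shown is only that a hash-join probe can be separated into a step that finds the matching bucket without multiplying the tuple (Lookup) and a step that enumerates the matches (Expand), so that a selective lookup can run before a growing expansion.

```python
from collections import defaultdict

def build(rows, key):
    """Build a hash table mapping a join key to the list of rows with that key."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(row)
    return table

def lookup(tuples, table, key, slot):
    """Lookup: attach the bucket of matches to each tuple under `slot`.
    Tuples without a match are dropped; no tuple is multiplied here."""
    for t in tuples:
        bucket = table.get(t[key])
        if bucket:
            yield {**t, slot: bucket}

def expand(tuples, slot):
    """Expand: emit one output tuple per match stored under `slot`."""
    for t in tuples:
        rest = {k: v for k, v in t.items() if k != slot}
        for match in t[slot]:
            yield {**rest, **match}

# Hypothetical toy data: R joins S on "a" (growing n:m join) and T on "b" (selective).
R = [{"a": 0, "b": i} for i in range(100)]
S = [{"a": 0, "x": j} for j in range(100)]
T = [{"b": 0, "y": 0}]
s_table, t_table = build(S, "a"), build(T, "b")

# Binary-join style: each lookup is immediately followed by its expand.
rs = list(expand(lookup(R, s_table, "a", "_s"), "_s"))
binary_result = list(expand(lookup(rs, t_table, "b", "_t"), "_t"))

# Reordered: both lookups run first, so the selective T lookup filters
# before the S matches are expanded.
survivors = list(lookup(lookup(R, s_table, "a", "_s"), t_table, "b", "_t"))
split_result = list(expand(expand(survivors, "_s"), "_t"))

binary_intermediate = len(rs)         # tuples materialized between the two joins
split_intermediate = len(survivors)   # tuples alive before any expansion
```

In this toy case the binary-style plan materializes 10,000 intermediate tuples for a 100-tuple result, while the reordered plan keeps a single tuple alive before expanding. Here a different binary join order would also avoid the blow-up; the sketch only shows the mechanics of separating the two steps, not the cases where no binary join order helps.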

Citation

Pages: 3215-3228

Page count: 14

50 entries in total

[1] Fast joins using join indices
Zhe Li
Kenneth A. Ross
The VLDB Journal, 1999, 8 : 1 - 24
[2] Wander Join: Online Aggregation for Joins
Li, Feifei
Wu, Bin
Yi, Ke
Zhao, Zhuoyue
SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 2121 - 2124
[3] Fast joins using join indices
Li, Z
Ross, KA
VLDB JOURNAL, 1999, 8 (01): : 1 - 24
[4] Faster joins, self-joins and multi-way joins using join indices
Lei, H
Ross, KA
DATA & KNOWLEDGE ENGINEERING, 1999, 29 (02) : 179 - 200
[5] Faster joins, self-joins and multi-way joins using join indices
Lei, H
Ross, KA
DATA & KNOWLEDGE ENGINEERING, 1998, 28 (03) : 277 - 298
[6] Faster joins, self-joins and multi-way joins using join indices
Lei, Hui
Ross, Kenneth A.
Data and Knowledge Engineering, 1999, 29 (02): : 179 - 200
[7] Evaluation of main memory join algorithms for joins with subset join predicates
Helmer, S
Moerkotte, G
PROCEEDINGS OF THE TWENTY-THIRD INTERNATIONAL CONFERENCE ON VERY LARGE DATABASES, 1997, : 386 - 395
[8] To Join or Not to Join? Thinking Twice about Joins before Feature Selection
Kumar, Arun
Naughton, Jeffrey
Patel, Jignesh M.
Zhu, Xiaojin
SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 19 - 34
[9] Block-Join: A Partition-Based Method for Processing Spatio-Temporal Joins
Li, Ting
Xu, Jianqiu
WEB AND BIG DATA, PT III, APWEB-WAIM 2022, 2023, 13423 : 397 - 411
[10] INFINITE JOINS THAT ARE FINITELY JOIN-IRREDUCIBLE
BERGMAN, GM
ZIMMERMANNHUISGEN, B
ORDER-A JOURNAL ON THE THEORY OF ORDERED SETS AND ITS APPLICATIONS, 1990, 7 (01): : 27 - 40
