Selectivity Estimation of Inequality Joins in Databases

Cited by: 1
Authors
Repas, Diogo [1]
Luo, Zhicheng [1]
Schoemans, Maxime [1]
Sakr, Mahmoud [1,2]
Affiliations
[1] Univ libre Bruxelles ULB, Data Sci Lab, B-1050 Brussels, Belgium
[2] Ain Shams Univ, Fac Comp & Informat Sci, Cairo 11566, Egypt
Keywords
SQL; query optimization; optimizer statistics
DOI
10.3390/math11061383
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Selectivity estimation refers to the ability of the SQL query optimizer to estimate the size of the results of a predicate in the query. It is the main calculation based on which the optimizer can select the least expensive plan to execute. While the problem has been known since the mid-1970s, we were surprised that there are no solutions in the literature for the selectivity estimation of inequality joins. By testing four common database systems: Oracle, SQL-Server, PostgreSQL, and MySQL, we found that the open-source systems PostgreSQL and MySQL lack this estimation. Oracle and SQL-Server make fairly accurate estimations, yet their algorithms are secret. This paper, thus, proposes an algorithm for inequality join selectivity estimation. The proposed algorithm was implemented in PostgreSQL and sent as a patch to be included in the next releases. We compared this implementation with the above DBMS for three different data distributions (uniform, normal, and Zipfian) and showed that our algorithm provides extremely accurate estimations (below 0.1% average error), outperforming the other systems by an order of magnitude.
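To make the quantity in the abstract concrete, the following is a minimal sketch of the exact selectivity of an inequality join predicate `r < s`: the fraction of row pairs from two columns that satisfy the predicate. This is a brute-force baseline for illustration only, not the estimation algorithm proposed in the paper; the function name `inequality_join_selectivity` and the sample data are hypothetical.

```python
from bisect import bisect_right


def inequality_join_selectivity(r_values, s_values):
    """Exact selectivity of the join predicate r < s over two columns.

    Illustrative baseline only (hypothetical helper, not the paper's method):
    it scans the real data, whereas an optimizer estimates from statistics.
    """
    if not r_values or not s_values:
        return 0.0
    s_sorted = sorted(s_values)
    n_s = len(s_sorted)
    # For each r, count the s values strictly greater than r.
    matches = sum(n_s - bisect_right(s_sorted, r) for r in r_values)
    # Selectivity = matching pairs / size of the cross product.
    return matches / (len(r_values) * n_s)


r = [1, 2, 3, 4]
s = [2, 3, 5]
selectivity = inequality_join_selectivity(r, s)  # 7 of 12 pairs satisfy r < s
```

A query optimizer cannot afford this scan at planning time, which is why an estimate derived from precomputed column statistics is needed instead.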
Pages: 18
Related papers
50 in total
  • [21] Optimal Re-encryption Strategy for Joins in Encrypted Databases
    Kerschbaum, Florian
    Haerterich, Martin
    Grofig, Patrick
    Kohler, Mathias
    Schaad, Andreas
    Schroepfer, Axel
    Tighzert, Walter
    DATA AND APPLICATIONS SECURITY AND PRIVACY XXVII, 2013, 7964 : 195 - 210
  • [22] Approximate processing of multiway spatial joins in very large databases
    Papadias, D
    Arkoumanis, D
    ADVANCES IN DATABASE TECHNOLOGY - EDBT 2002, 2002, 2287 : 179 - 196
  • [23] OPTIMIZING JOINS BETWEEN 2 PARTITIONED RELATIONS IN DISTRIBUTED DATABASES
    CERI, S
    GOTTLOB, G
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1986, 3 (02) : 183 - 205
  • [24] Strengthening the Security of Encrypted Databases: Non-transitive JOINs
    Mironov, Ilya
    Segev, Gil
    Shahaf, Ido
    THEORY OF CRYPTOGRAPHY, TCC 2017, PT II, 2017, 10678 : 631 - 661
  • [25] Improvement of Join Algorithms for Low-Selectivity Joins on MapReduce
    Matono, Akiyoshi
    Ogawa, Hirotaka
    Kojima, Isao
    DATABASES THEORY AND APPLICATIONS, 2015, 9093 : 117 - 128
  • [26] SIMD Accelerates the Probe Phase of Star Joins in Main Memory Databases
    Fang, Zhuhe
    He, Zeyu
    Chu, Jiajia
    Weng, Chuliang
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2019, 11448 : 476 - 480
  • [28] Mixing selections and foreign key joins in queries against possibilistic databases
    Bosc, P
    Pivert, O
    FOUNDATIONS OF INTELLIGENT SYSTEMS, PROCEEDINGS, 2002, 2366 : 194 - 202
  • [30] Migration selectivity and the evolution of spatial inequality
    Kanbur, R
    Rapoport, H
    JOURNAL OF ECONOMIC GEOGRAPHY, 2005, 5 (01) : 43 - 57