An Approach Based on Bayesian Networks for Query Selectivity Estimation

Cited by: 5
Authors
Halford, Max [1 ,2 ]
Saint-Pierre, Philippe [2 ]
Morvan, Franck [1 ]
Affiliations
[1] Paul Sabatier Univ, IRIT Lab, Toulouse, France
[2] Paul Sabatier Univ, IMT Lab, Toulouse, France
Keywords
Query optimisation; Cardinality estimation; Bayesian networks; Complexity; Size
DOI
10.1007/978-3-030-18579-4_1
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The efficiency of a query execution plan depends on the accuracy of the selectivity estimates that the cost model gives to the query optimiser. The cost model makes simplifying assumptions in order to produce these estimates in a timely manner. These assumptions lead to selectivity estimation errors that have dramatic effects on the quality of the resulting query execution plans. A convenient assumption that is ubiquitous among current cost models is that attributes are independent of one another. However, this assumption ignores potential correlations, which can have a large negative impact on the accuracy of the cost model. In this paper we attempt to relax the attribute value independence assumption without unreasonably deteriorating the accuracy of the cost model. We propose a novel approach based on a particular type of Bayesian network, the Chow-Liu tree, to approximate the distribution of attribute values inside each relation of a database. Our results on the TPC-DS benchmark show that our method is an order of magnitude more precise than other approaches whilst remaining reasonably efficient in terms of time and space.
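To make the idea in the abstract concrete, the following is a minimal sketch of how a Chow-Liu tree could be fitted to one relation and used to estimate the selectivity of a conjunction of equality predicates. It is not the authors' implementation: the toy relation, the column names, and the helper functions `mutual_information`, `chow_liu_edges`, `selectivity` and `independent_selectivity` are all hypothetical, and exact per-value counts stand in for whatever compact summaries a real cost model would store.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(rows, a, b):
    """Empirical mutual information (in nats) between columns a and b."""
    n = len(rows)
    pa = Counter(r[a] for r in rows)
    pb = Counter(r[b] for r in rows)
    pab = Counter((r[a], r[b]) for r in rows)
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def chow_liu_edges(rows, columns):
    """Maximum spanning tree over pairwise mutual information (Kruskal)."""
    weighted = sorted(((mutual_information(rows, a, b), a, b)
                       for a, b in combinations(columns, 2)), reverse=True)
    parent = {c: c for c in columns}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    edges = []
    for _, a, b in weighted:
        ra, rb = find(a), find(b)
        if ra != rb:  # keep the edge only if it joins two components
            parent[ra] = rb
            edges.append((a, b))
    return edges

def selectivity(rows, edges, root, predicates):
    """P(all equality predicates hold) under the tree factorisation."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)

    def message(node, parent, parent_value):
        # Distribution of `node`, conditioned on its parent's value if any.
        if parent is None:
            counts = Counter(r[node] for r in rows)
        else:
            counts = Counter(r[node] for r in rows if r[parent] == parent_value)
        total = sum(counts.values())
        result = 0.0
        for value, c in counts.items():
            if node in predicates and predicates[node] != value:
                continue  # value excluded by the predicate on this column
            p = c / total
            for child in adj.get(node, []):
                if child != parent:
                    p *= message(child, node, value)
            result += p
        return result

    return message(root, None, None)

def independent_selectivity(rows, predicates):
    """Baseline: multiply per-column selectivities (independence assumption)."""
    n = len(rows)
    estimate = 1.0
    for column, value in predicates.items():
        estimate *= sum(r[column] == value for r in rows) / n
    return estimate

# Hypothetical toy relation with strongly correlated columns.
rows = (
    [{"city": "Paris", "country": "France", "lang": "fr"}] * 3
    + [{"city": "Toulouse", "country": "France", "lang": "fr"}]
    + [{"city": "Rome", "country": "Italy", "lang": "it"}] * 2
    + [{"city": "Milan", "country": "Italy", "lang": "it"}] * 2
)
columns = ["city", "country", "lang"]
edges = chow_liu_edges(rows, columns)
query = {"city": "Paris", "lang": "fr"}
naive_estimate = independent_selectivity(rows, query)
tree_estimate = selectivity(rows, edges, "city", query)
print(naive_estimate, tree_estimate)
```

On this toy relation, 3 of the 8 rows satisfy the query, so the true selectivity is 0.375. The independence baseline yields 0.1875, while the tree-based estimate recovers 0.375 because the columns here are fully determined by `city`. Real data would rarely be this clean, and the tree would then give an approximation rather than the exact value.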
Pages: 3-19
Page count: 17