FACE: A Normalizing Flow based Cardinality Estimator

Cited by: 24
Authors
Wang, Jiayi [1]
Chai, Chengliang [1]
Liu, Jiabin [1]
Li, Guoliang [1]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2021 / Vol. 15 / No. 01
Funding
China Postdoctoral Science Foundation;
Keywords
PREDICTION;
DOI
10.14778/3485450.3485458
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Cardinality estimation is one of the most important problems in query optimization. Recently, machine learning based techniques have been proposed to effectively estimate cardinality, which can be broadly classified into query-driven and data-driven approaches. Query-driven approaches learn a regression model from a query to its cardinality, while data-driven approaches learn a distribution of tuples, select some samples that satisfy a SQL query, and use the data distributions of these selected tuples to estimate the cardinality of the SQL query. As query-driven methods rely on training queries, the estimation quality is not reliable when there are no high-quality training queries, while data-driven methods have no such limitation and have high adaptivity. In this work, we focus on data-driven methods. A good data-driven model should achieve three optimization goals. First, the model needs to capture data dependencies between columns and support large domain sizes (achieving high accuracy). Second, the model should achieve high inference efficiency, because many data samples are needed to estimate the cardinality (achieving low inference latency). Third, the model should not be too large (achieving a small model size). However, existing data-driven methods cannot simultaneously optimize the three goals. To address the limitations, we propose a novel cardinality estimator FACE, which leverages the Normalizing Flow based model to learn a continuous joint distribution for relational data. FACE can transform a complex distribution over continuous random variables into a simple distribution (e.g., multivariate normal distribution), and use the probability density to estimate the cardinality. First, we design a dequantization method to make data more "continuous". Second, we propose encoding and indexing techniques to handle LIKE predicates for string data. Third, we propose a Monte Carlo method to efficiently estimate the cardinality. Experimental results show that our method significantly outperforms existing approaches in terms of estimation accuracy while keeping similar latency and model size.
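The idea of estimating cardinality from a probability density can be illustrated with a minimal sketch. This is not FACE's implementation: the `density` function below is a hand-written stand-in (two independent normal distributions) for a density that a normalizing flow would learn, the uniform sampling is the simplest possible Monte Carlo integrator rather than the method proposed in the paper, and all function names are hypothetical. The sketch assumes uniform dequantization, so an integer predicate `lo <= col <= hi` maps to the continuous interval `[lo, hi + 1)`.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def density(point):
    """Stand-in joint density over two columns (assumed, not learned).

    A real data-driven estimator would evaluate a trained model here.
    """
    x, y = point
    return normal_pdf(x, 50.0, 10.0) * normal_pdf(y, 20.0, 5.0)


def dequantized_range(lo, hi):
    """Map an inclusive integer range to a continuous interval.

    Assumes each integer v was dequantized to v + u with u in [0, 1).
    """
    return (float(lo), float(hi) + 1.0)


def estimate_cardinality(density_fn, ranges, num_rows, num_samples=20000, seed=0):
    """Monte Carlo estimate of num_rows * integral of density over a box.

    Samples points uniformly inside the query box, averages the density,
    and multiplies by the box volume to approximate the selectivity.
    """
    rng = random.Random(seed)
    volume = 1.0
    for lo, hi in ranges:
        volume *= hi - lo
    total = 0.0
    for _ in range(num_samples):
        point = tuple(rng.uniform(lo, hi) for lo, hi in ranges)
        total += density_fn(point)
    selectivity = volume * total / num_samples
    return num_rows * selectivity


# Query: 40 <= a <= 59 AND 15 <= b <= 24 on a table with 100,000 rows.
query_box = [dequantized_range(40, 59), dequantized_range(15, 24)]
estimate = estimate_cardinality(density, query_box, num_rows=100_000)
print(round(estimate))
```

Because the stand-in density has a closed-form integral over the box, the estimate can be checked against `100_000 * math.erf(1 / math.sqrt(2)) ** 2`; with a learned density no such closed form exists, which is why a sampling-based estimate is needed.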
Pages: 72 - 84
Number of pages: 13
Related papers
  • [1] Cardinality estimation using normalizing flow
    Jiayi Wang
    Chengliang Chai
    Jiabin Liu
    Guoliang Li
    The VLDB Journal, 2024, 33 (2) : 323 - 348
  • [2] Cardinality estimation using normalizing flow
    Wang, Jiayi
    Chai, Chengliang
    Liu, Jiabin
    Li, Guoliang
    VLDB JOURNAL, 2024, 33 (02) : 323 - 348
  • [3] A Cardinality Estimator in Complex Database Systems Based on TreeLSTM
    Qi, Kaiyang
    Yu, Jiong
    He, Zhenzhen
    SENSORS, 2023, 23 (17)
  • [4] GACE: Graph-Attention-Network-Based Cardinality Estimator
    Zhu, Daobing
    He, Dongsheng
    Fan, Shuhuan
    Liao, Jianming
    Hou, Mengshu
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2021, PT II, 2021, 12924 : 332 - 345
  • [5] Cardinality and Cost Estimator Based on Tree Gated Recurrent Unit
    Qiao S.-J.
    Yang G.-P.
    Han N.
    Qu L.-L.
    Chen H.
    Mao R.
    Yuan C.-A.
    Gutierrez L.A.
    Ruan Jian Xue Bao/Journal of Software, 2022, 33 (03) : 797 - 813
  • [6] Normalizing Cardinality Rules Using Merging and Sorting Constructions
    Bomanson, Jori
    Janhunen, Tomi
    LOGIC PROGRAMMING AND NONMONOTONIC REASONING (LPNMR 2013), 2013, 8148 : 187 - 199
  • [7] A BiLSTM cardinality estimator in complex database systems based on attention mechanism
    Zhou, Qiang
    Yang, Guoping
    Song, Haiquan
    Guo, Jin
    Zhang, Yadong
    Wei, Shengjie
    Qu, Lulu
    Gutierrez, Louis Alberto
    Qiao, Shaojie
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2022, 7 (03) : 537 - 546
  • [8] NeuroCard: One Cardinality Estimator for All Tables
    Yang, Zongheng
    Kamsetty, Amog
    Luan, Sifei
    Liang, Eric
    Duan, Yan
    Chen, Xi
    Stoica, Ion
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2020, 14 (01) : 61 - 73
  • [9] ALECE: An Attention-based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads
    Li, Pengfei
    Wei, Wenqing
    Zhu, Rong
    Ding, Bolin
    Zhou, Jingren
    Lu, Hua
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 17 (02) : 197 - 210
  • [10] Robust Cardinality Estimator by Non-autoregressive Model
    Ito, Ryuichi
    Xiao, Chuan
    Onizuka, Makoto
    SOFTWARE FOUNDATIONS FOR DATA INTEROPERABILITY, SFDI 2021, 2022, 1457 : 55 - 61