Automating localized learning for cardinality estimation based on XGBoost

Cited by: 0
Authors
Feng, Jieming [1, 2]
Li, Zhanhuai [1, 2]
Chen, Qun [1, 2]
Liu, Hailong [1, 2]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci & Engn, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Key Lab Big Data Storage & Management, Minist Ind & Informat Technol, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Self-driving DBMS; AI4DB; ML for cardinality estimation; Local models; Automation; Selectivity estimation
DOI
10.1007/s10115-024-02142-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For cardinality estimation in a DBMS, building multiple local models instead of one global model usually improves estimation accuracy and reduces the effort needed to label large amounts of training data. Unfortunately, the existing approach to localized learning requires users to explicitly specify which query patterns each local model can handle. Making these decisions is arduous and error-prone for users, and it limits the usability of local models. In this paper, we propose a localized learning solution for cardinality estimation based on XGBoost, which automatically builds an optimal combination of local models for a given query workload. It consists of two phases: 1) model initialization; 2) model evolution. In the first phase, it clusters the training data into a set of coarse-grained query pattern groups based on pattern similarity and constructs a separate local model for each group. In the second phase, it iteratively merges and splits clusters, reconstructing the local models, to identify an optimal combination. We formulate the identification of the optimal combination of local models as a combinatorial optimization problem and, because of its exponential complexity, present an efficient heuristic algorithm named MMS (Models Merging and Splitting) to solve it. Finally, we validate its performance superiority over existing learning alternatives through extensive experiments on real datasets.
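The two phases described in the abstract can be illustrated with a minimal sketch. This is not the paper's MMS algorithm: the record layout, the `MeanModel` stand-in (used in place of XGBoost so the sketch needs only the standard library), the cost function with a per-model `penalty`, and the merge-only greedy loop are all assumptions made for illustration. Splitting is omitted, and training error is used where a real system would use held-out data.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical training record: (pattern, features, log_cardinality).
# `pattern` is the tuple of columns a query filters on; it stands in for
# whatever notion of pattern similarity the paper actually uses.


class MeanModel:
    """Stand-in for a local regressor (the paper builds XGBoost models)."""

    def fit(self, rows):
        self.value = mean(y for _, _, y in rows)
        return self

    def predict(self, features):
        return self.value


def error(model, rows):
    """Mean absolute error of `model` on `rows`."""
    return mean(abs(model.predict(x) - y) for _, x, y in rows)


def initialize(rows):
    """Phase 1 (assumed form): one group of training rows per query pattern."""
    groups = defaultdict(list)
    for row in rows:
        groups[frozenset(row[0])].append(row)
    return list(groups.values())


def cost(groups, penalty):
    """Assumed objective: total absolute error plus a penalty per local model."""
    fit_error = sum(error(MeanModel().fit(g), g) * len(g) for g in groups)
    return fit_error + penalty * len(groups)


def evolve(groups, penalty=0.5):
    """Phase 2, merge only: repeatedly apply the merge that lowers cost most."""
    groups = list(groups)
    while len(groups) > 1:
        best, best_cost = None, cost(groups, penalty)
        for i, j in combinations(range(len(groups)), 2):
            candidate = [g for k, g in enumerate(groups) if k not in (i, j)]
            candidate.append(groups[i] + groups[j])
            candidate_cost = cost(candidate, penalty)
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
        if best is None:  # no merge improves the objective
            break
        groups = best
    return groups


rows = [
    (("a",), [1.0], 2.0), (("a",), [2.0], 2.0),
    (("b",), [1.0], 2.0), (("b",), [3.0], 2.0),
    (("c",), [1.0], 10.0), (("c",), [2.0], 10.0),
]
initial_groups = initialize(rows)
final_groups = evolve(initial_groups)
```

In this toy workload the patterns `a` and `b` have identical targets, so merging them costs no accuracy and saves one model, while `c` stays separate. Replacing `MeanModel` with a real regressor would not change the structure of the loop, only the cost of each candidate evaluation, which is why an exhaustive search over all groupings quickly becomes impractical.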
Pages: 3825 - 3854
Page count: 30
Related papers
50 in total
  • [21] Cardinality estimation of activity trajectory similarity queries using deep learning
    Tian, Ruijie
    Zhang, Weishi
    Wang, Fei
    Zhou, Jingchun
    Alhudhaif, Adi
    Alenezi, Fayadh
    INFORMATION SCIENCES, 2023, 646
  • [22] Cardinality Estimation in a Virtualized Network Device Using Online Machine Learning
    Cohen, Reuven
    Nezri, Yuval
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2019, 27 (05) : 2098 - 2110
  • [23] Sample-Efficient Cardinality Estimation Using Geometric Deep Learning
    Reiner, Silvan
    Grossniklaus, Michael
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 17 (04) : 740 - 752
  • [24] Incremental Locally Weighted Learning for Adaptive Cardinality Estimation of Query Template
    Feng J.-M.
    Li Z.-H.
    Chen Q.
    Chen Z.-Q.
    Jisuanji Xuebao/Chinese Journal of Computers, 2022, 45 (01) : 17 - 34
  • [25] Estimation of soil temperature based on XGBoost and LSTM methods
    Li, Qing-Liang
    Cai, Kai-Xuan
    Geng, Qing-Tian
    Liu, Guang-Jie
    Sun, Ming-Yu
    Zhang, Yu
    Yu, Fan-Hua
    Guangxue Jingmi Gongcheng/Optics and Precision Engineering, 2020, 28 (10) : 2337 - 2348
  • [26] Network Host Cardinality Estimation Based on Artificial Neural Network
    Jie, Xu
    Lan Haoliang
    Wei, Ding
    Ao, Ju
    SECURITY AND COMMUNICATION NETWORKS, 2022, 2022
  • [27] Cardinality Estimation: An Experimental Survey
    Harmouch, Hazar
    Naumann, Felix
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2017, 11 (04) : 499 - 512
  • [28] Cardinality estimation for property graph queries with gated learning approach on the graph database
    Zhenzhen He
    Jiong Yu
    Xusheng Du
    Binglei Guo
    Ziyang Li
    Zhe Li
    Multimedia Tools and Applications, 2025, 84 (11) : 9159 - 9183
  • [29] Deep Unsupervised Cardinality Estimation
    Yang, Zongheng
    Liang, Eric
    Kamsetty, Amog
    Wu, Chenggang
    Duan, Yan
    Chen, Xi
    Abbeel, Pieter
    Hellerstein, Joseph M.
    Krishnan, Sanjay
    Stoica, Ion
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2019, 13 (03) : 279 - 292
  • [30] A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation
    Wu, Peizhi
    Cong, Gao
    SIGMOD '21: PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2021 : 2009 - 2022