Automating localized learning for cardinality estimation based on XGBoost

Cited by: 0
Authors
Feng, Jieming [1 ,2 ]
Li, Zhanhuai [1 ,2 ]
Chen, Qun [1 ,2 ]
Liu, Hailong [1 ,2 ]
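Affiliations
[1] Northwestern Polytechnical University, School of Computer Science and Engineering, Xi'an 710072, People's Republic of China
[2] Northwestern Polytechnical University, Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Xi'an 710072, People's Republic of China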
Funding
National Natural Science Foundation of China
Keywords
Self-driving DBMS; AI4DB; ML for cardinality estimation; Local models; Automation; Selectivity estimation
DOI
10.1007/s10115-024-02142-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For cardinality estimation in a DBMS, building multiple local models instead of one global model usually improves estimation accuracy and reduces the effort of labeling large amounts of training data. Unfortunately, the existing approach to localized learning requires users to specify explicitly which query patterns each local model can handle. Making these decisions is arduous and error-prone for users, and it limits the usability of local models. In this paper, we propose a localized learning solution for cardinality estimation based on XGBoost, which automatically builds an optimal combination of local models for a given query workload. It consists of two phases: 1) model initialization; 2) model evolution. In the first phase, it clusters the training data into a set of coarse-grained query pattern groups based on pattern similarity and constructs a separate local model for each group. In the second phase, it iteratively merges and splits clusters, reconstructing the local models each time, to identify an optimal combination. We formulate the identification of the optimal combination of local models as a combinatorial optimization problem. Because this problem has exponential complexity, we present an efficient heuristic algorithm, named MMS (Models Merging and Splitting), to solve it. Finally, we validate its superior performance over existing learning alternatives through extensive experiments on real datasets.
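The merge step of the second phase can be illustrated with a small sketch. This is not the authors' MMS algorithm: it covers merging only, starts from one group per query pattern rather than similarity-based clusters, and uses a hypothetical `MeanModel` (predicting the mean log-cardinality) as a stand-in for an XGBoost regressor. The pattern encoding, data values, and function names are all assumptions made for illustration.

```python
from statistics import mean


class MeanModel:
    """Hypothetical stand-in for a local regressor such as XGBoost."""

    def __init__(self, labels):
        self.value = mean(labels)

    def predict(self):
        return self.value


def cluster_error(cluster, train_groups, valid_groups):
    """Absolute validation error of one local model trained on a cluster."""
    train = [y for p in cluster for y in train_groups[p]]
    valid = [y for p in cluster for y in valid_groups[p]]
    model = MeanModel(train)
    return sum(abs(model.predict() - y) for y in valid)


def greedy_merge(train_groups, valid_groups):
    """Repeatedly merge the pair of clusters that most reduces validation error."""
    clusters = [(p,) for p in sorted(train_groups)]

    def err(c):
        return cluster_error(c, train_groups, valid_groups)

    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = clusters[i] + clusters[j]
                gain = err(clusters[i]) + err(clusters[j]) - err(merged)
                if gain > 1e-12 and (best is None or gain > best[0]):
                    best = (gain, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]


# Illustrative data: pattern (tuple of predicate columns) -> log-cardinalities.
TRAIN = {("age",): [2.0, 2.4], ("age", "city"): [1.8], ("price",): [6.0, 6.2]}
VALID = {("age",): [2.1], ("age", "city"): [2.2], ("price",): [6.1]}

clusters = greedy_merge(TRAIN, VALID)
print(clusters)
```

On this toy data, the sparsely sampled `("age", "city")` group is merged with `("age",)` because the pooled model lowers held-out error, while `("price",)` stays separate. A fuller treatment would also consider splitting clusters, as the abstract describes.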
Pages: 3825-3854
Page count: 30