Hit the Gym: Accelerating Query Execution to Efficiently Bootstrap Behavior Models for Self-Driving Database Management Systems

Cited by: 0
Authors
Lim, Wan Shen [1]
Ma, Lin [2]
Zhang, William [1]
Butrovich, Matthew [1]
Arch, Samuel [1]
Pavlo, Andrew [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Univ Michigan, Ann Arbor, MI USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2024 / Volume 17 / Issue 11
Keywords
SELECTIVITY
DOI
10.14778/3681954.3682030
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Autonomous database management systems (DBMSs) aim to optimize themselves automatically without human guidance. They rely on machine learning (ML) models that predict their run-time behavior to evaluate whether a candidate configuration is beneficial without the expensive execution of queries. However, the high cost of collecting the training data to build these models makes them impractical for real-world deployments. Furthermore, these models are instance-specific and thus require retraining whenever the DBMS's environment changes. State-of-the-art methods spend over 93% of their time running queries for training versus tuning. To mitigate this problem, we present the Boot framework for automatically accelerating training data collection in DBMSs. Boot utilizes macro- and micro-acceleration (MMA) techniques that modify query execution semantics with approximate run-time telemetry and skip repetitive parts of the training process. To evaluate Boot, we integrated it into a database gym for PostgreSQL. Our experimental evaluation shows that Boot reduces training collection times by up to 268x with modest degradation in model accuracy. These results also indicate that our MMA-based approach scales with dataset size and workload complexity.
Pages: 3680-3693
Page count: 14
Related papers
3 items in total
  • [1] Query-based Workload Forecasting for Self-Driving Database Management Systems
    Ma, Lin
    Van Aken, Dana
    Hefny, Ahmed
    Mezerhane, Gustavo
    Pavlo, Andrew
    Gordon, Geoffrey J.
    SIGMOD '18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2018: 631-645
  • [2] MB2: Decomposed Behavior Modeling for Self-Driving Database Management Systems
    Ma, Lin
    Zhang, William
    Jiao, Jie
    Wang, Wuwen
    Butrovich, Matthew
    Lim, Wan Shen
    Menon, Prashanth
    Pavlo, Andrew
    SIGMOD '21: PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2021: 1248-1261
  • [3] Tastes Great! Less Filling! High Performance and Accurate Training Data Collection for Self-Driving Database Management Systems
    Butrovich, Matthew
    Lim, Wan Shen
    Ma, Lin
    Rollinson, John
    Zhang, William
    Xia, Yu
    Pavlo, Andrew
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022: 617-630