Hit the Gym: Accelerating Query Execution to Efficiently Bootstrap Behavior Models for Self-Driving Database Management Systems

Cited by: 0

Authors:

Lim, Wan Shen [1]

Ma, Lin [2]

Zhang, William [1]

Butrovich, Matthew [1]

Arch, Samuel [1]

Pavlo, Andrew [1]

Affiliations:

[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[2] Univ Michigan, Ann Arbor, MI USA

Source:

PROCEEDINGS OF THE VLDB ENDOWMENT | 2024 / Volume 17 / Issue 11

Keywords:

SELECTIVITY;

DOI:

10.14778/3681954.3682030

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Autonomous database management systems (DBMSs) aim to optimize themselves automatically without human guidance. They rely on machine learning (ML) models that predict their run-time behavior to evaluate whether a candidate configuration is beneficial without the expensive execution of queries. However, the high cost of collecting the training data to build these models makes them impractical for real-world deployments. Furthermore, these models are instance-specific and thus require retraining whenever the DBMS's environment changes. State-of-the-art methods spend over 93% of their time running queries for training versus tuning. To mitigate this problem, we present the Boot framework for automatically accelerating training data collection in DBMSs. Boot utilizes macro- and micro-acceleration (MMA) techniques that modify query execution semantics with approximate run-time telemetry and skip repetitive parts of the training process. To evaluate Boot, we integrated it into a database gym for PostgreSQL. Our experimental evaluation shows that Boot reduces training collection times by up to 268x with modest degradation in model accuracy. These results also indicate that our MMA-based approach scales with dataset size and workload complexity.

Pages: 3680-3693

Page count: 14
