Geno: Cost-based Heterogeneous Fusion Query Optimizer

Authors
Tu Y.-F. [1, 2]
Chen X.-Q. [2]
Zhou S.-J. [2]
Bian F.-S. [2]
Wu F. [2]
Chen B. [1]
Affiliations
[1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing
[2] Zhongxing Telecommunication Equipment Corporation, Nanjing
Source
Ruan Jian Xue Bao/Journal of Software | 2022, Vol. 33, No. 03
Keywords
Database; FPGA; GPU; Heterogeneous computing; Query optimization
DOI
10.13328/j.cnki.jos.006441
Abstract
New hardware and the environments built on it have changed traditional computing, storage, and network systems, and have overturned earlier design assumptions of upper-level software. In particular, heterogeneous computing architectures that combine general-purpose processors with dedicated accelerators have changed both the design of the underlying database framework and the cost model used for query optimization. A database system must adapt to the characteristics of the new hardware to exploit its full potential. This paper proposes Geno, a cost-based query optimizer for fused CPU/GPU/FPGA heterogeneous computing that flexibly schedules and optimizes the use of the various computing resources. The main contributions are: (1) showing that adjusting the cost parameters according to the actual hardware capabilities of the system environment can significantly improve the accuracy of the query plan, and proposing a calculation method and calibration tool for the cost of heterogeneous resources; (2) establishing a cost model for query processing in a heterogeneous computing environment by estimating the capabilities of heterogeneous hardware such as GPUs and FPGAs and calibrating the actual capabilities of the database system hardware; (3) implementing GPU and FPGA operators that support selection, projection, join, and aggregation, together with a pipeline design for each operator family; and (4) solving operator assignment and scheduling through cost-based evaluation, generating a heterogeneous collaborative execution plan that makes full use of the advantages of each heterogeneous resource. Experiments show that the parameter values calibrated by Geno match the actual hardware capabilities more closely. Compared with PostgreSQL and the GPU database HeteroDB, Geno generates more reasonable query plans. In the TPC-H scenario with row storage, Geno's execution time is 64%-93% less than that of PostgreSQL and 1%-39% less than that of HeteroDB; with column storage, Geno's execution time is 87%-92% less than that of PostgreSQL and 1%-81% less than that of HeteroDB. Compared with row storage, Geno reduces query execution time by 32%-89% with column storage.
© Copyright 2022, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
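The cost-based operator assignment described in the abstract can be illustrated with a minimal sketch. This is not Geno's actual cost model, which the abstract does not spell out: the `DeviceCost` fields, the linear cost formula, and every numeric value below are hypothetical placeholders standing in for calibrated parameters. The sketch only shows the general idea of estimating each operator's cost on each device and picking the cheapest one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceCost:
    """Hypothetical calibrated cost parameters for one device (illustrative only)."""
    startup: float             # fixed cost to launch an operator on the device
    per_tuple: float           # cost to process one tuple
    transfer_per_tuple: float  # cost to move one tuple to the device (0 for CPU)


def operator_cost(dev: DeviceCost, n_tuples: int) -> float:
    """Assumed linear model: startup plus per-tuple processing and transfer."""
    return dev.startup + n_tuples * (dev.per_tuple + dev.transfer_per_tuple)


def assign_operators(plan, devices):
    """Choose the cheapest device for each (operator, cardinality) pair independently."""
    assignment = []
    for op, n_tuples in plan:
        best = min(devices, key=lambda name: operator_cost(devices[name], n_tuples))
        assignment.append((op, best, operator_cost(devices[best], n_tuples)))
    return assignment


# Made-up parameter values; a real system would obtain these by calibration.
DEVICES = {
    "cpu": DeviceCost(startup=0.0, per_tuple=1.0, transfer_per_tuple=0.0),
    "gpu": DeviceCost(startup=5000.0, per_tuple=0.05, transfer_per_tuple=0.2),
    "fpga": DeviceCost(startup=2000.0, per_tuple=0.1, transfer_per_tuple=0.3),
}

# Each entry is (operator name, estimated input cardinality).
plan = [("selection", 1_000), ("join", 1_000_000), ("aggregation", 4_000)]

for op, dev, cost in assign_operators(plan, DEVICES):
    print(f"{op}: {dev} (estimated cost {cost:.1f})")
```

With these placeholder values, small inputs stay on the CPU because the accelerator startup cost dominates, while large inputs amortize it. Treating each operator independently ignores pipelining and inter-operator transfer savings, which a full optimizer would need to account for.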
Pages: 774-796
Page count: 22
Related papers
25 in total
  • [1] Pei W, Li ZH, Pan W., Survey of key technologies in GPU database system, Ruan Jian Xue Bao/Journal of Software, 32, 3, pp. 859-885, (2021)
  • [2] Papaphilippou P, Luk W., Accelerating database systems using FPGAs: A survey, Proc. of the 28th Int’l Conf. on Field Programmable Logic and Applications (FPL), pp. 1312-1317, (2018)
  • [3] Bordawekar RR, Sadoghi M., Accelerating database workloads by software-hardware-system co-design, Proc. of the 2016 IEEE 32nd Int’l Conf. on Data Engineering (ICDE), pp. 1428-1431, (2016)
  • [4] Ailamaki A., Databases and hardware: The beginning and sequel of a beautiful friendship, Proc. of the VLDB Endowment, 8, 12, pp. 2058-2061, (2015)
  • [5] Yu XY, Bezerra G, Pavlo A, Devadas S, Stonebraker M., Staring into the abyss: An evaluation of concurrency control with one thousand cores, Proc. of the VLDB Endowment, 8, 3, pp. 209-220, (2014)
  • [6] Ibaraki T, Kameda T., On the optimal nesting order for computing N-relational joins, ACM Trans. on Database Systems, 9, 3, pp. 482-502, (1984)
  • [7] Bausch D, Petrov I, Buchmann A., Making cost-based query optimization asymmetry-aware, Proc. of the 8th Int’l Workshop on Data Management on New Hardware, pp. 24-32, (2012)
  • [8] Balkesen C, Teubner J, Alonso G, Özsu MT., Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware, Proc. of the 2013 IEEE 29th Int’l Conf. on Data Engineering (ICDE), pp. 362-373, (2013)
  • [9] He J, Zhang SH, He BS., In-cache query co-processing on coupled CPU-GPU architectures, Proc. of the VLDB Endowment, 8, 4, pp. 329-340, (2014)
  • [10] Cheng XT, He BS, Lu M, Lau CT, Huynh HP, Goh RSM., Efficient query processing on many-core architectures: A case study with Intel Xeon Phi processor, Proc. of the 2016 Int’l Conf. on Management of Data, pp. 2081-2084, (2016)