Elastic Pipelining in an In-Memory Database Cluster

Cited by: 14
Authors
Wang, Li [1, 2]
Zhou, Minqi [1]
Zhang, Zhenjie [2]
Yang, Yin [3]
Zhou, Aoying [1]
Bitton, Dina [4]
Affiliations
[1] East China Normal Univ, Inst Data Sci & Engn, Shanghai, Peoples R China
[2] Illinois Singapore Pte Ltd, Adv Digital Sci Ctr, Singapore, Singapore
[3] Hamad Bin Khalifa Univ, Coll Sci & Engn, Doha, Qatar
[4] Bitton Consulting, Los Angeles, CA USA
Funding
National Science Foundation (USA)
Keywords
JOIN;
DOI
10.1145/2882903.2882904
Chinese Library Classification
TP [Automation and computer technology]
Discipline classification code
0812
Abstract
An in-memory database cluster consists of multiple interconnected nodes, each with a large RAM capacity and modern multi-core CPUs. As a conventional query processing strategy, pipelining remains a promising solution for in-memory parallel database systems, as it avoids expensive materialization of intermediate results and parallelizes data processing across nodes. However, to fully unleash the power of pipelining in a cluster with multi-core nodes, it is crucial for the query optimizer to generate good query plans with appropriate intra-node parallelism, in order to maximize CPU and network bandwidth utilization. A suboptimal plan, by contrast, causes load imbalance in the pipelines and consequently degrades query performance. Optimizing the parallelism assignment at compile time is nearly impossible, as the workload on each node is affected by numerous factors and is highly dynamic during query evaluation. To tackle this problem, we propose elastic pipelining, which makes it possible to optimize intra-node parallelism assignments in the pipelines based on the actual workload at runtime. This is achieved by adopting a new elastic iterator model and a fully optimized dynamic scheduler. The elastic iterator model extends the traditional iterator model with the ability to adjust multi-core execution dynamically. The dynamic scheduler efficiently provisions CPU cores to query execution segments in the pipelines based on lightweight measurements on the operators. Extensive experiments on real and synthetic (TPC-H) data show that our proposal achieves almost full CPU utilization on typical decision-making analytical queries, outperforming state-of-the-art open-source systems by a huge margin.
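The runtime core provisioning described in the abstract can be illustrated with a small Python sketch. This is not the paper's scheduler: the `Segment` class, the `rate_per_core` measurement, the assumption that a segment's throughput scales linearly with its cores, and the greedy bottleneck-first policy are all hypothetical choices made for illustration. The sketch only shows the general idea of assigning cores to pipeline segments from measured rates so that the slowest segment is sped up first.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A pipeline segment with a measured per-core processing rate.

    rate_per_core is a hypothetical lightweight measurement: the number of
    tuples per second that one core sustains on this segment.
    """
    name: str
    rate_per_core: float
    cores: int = 1

    def throughput(self) -> float:
        # Assumes linear scaling with cores, which real operators only approximate.
        return self.cores * self.rate_per_core


def assign_cores(segments, total_cores):
    """Greedily hand each spare core to the current bottleneck segment."""
    if total_cores < len(segments):
        raise ValueError("need at least one core per segment")
    for seg in segments:
        seg.cores = 1  # every segment must be able to make progress
    for _ in range(total_cores - len(segments)):
        # The pipeline runs at the speed of its slowest segment.
        bottleneck = min(segments, key=lambda s: s.throughput())
        bottleneck.cores += 1
    return {s.name: s.cores for s in segments}


# Hypothetical measurements for a three-segment pipeline on an 8-core node.
segments = [
    Segment("scan", 400.0),
    Segment("join", 100.0),
    Segment("aggregate", 200.0),
]
plan = assign_cores(segments, total_cores=8)
pipeline_rate = min(s.throughput() for s in segments)
```

Calling `assign_cores` again whenever the measured rates change mimics, in a very simplified way, re-optimizing the parallelism assignment at runtime rather than fixing it at compile time.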
Pages: 1279-1294
Page count: 16