Optimizing main-memory join on modern hardware

Cited by: 95
Authors
Manegold, S [1]
Boncz, P [1]
Kersten, M [1]
Affiliation
[1] CWI, NL-1098 SJ Amsterdam, Netherlands
Keywords
main-memory databases; query processing; memory access optimization; decomposed storage model; join algorithms; implementation techniques;
DOI
10.1109/TKDE.2002.1019210
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the past decade, the exponential growth in commodity CPU speed has far outpaced advances in memory latency. A second trend is that CPU performance advances are not only brought by increased clock rate, but also by increasing parallelism inside the CPU. Current database systems have not yet adapted to these trends and show poor utilization of both CPU and memory resources on current hardware. In this paper, we show how these resources can be optimized for large joins and translate these insights into guidelines for future database architectures, encompassing data structures, algorithms, cost modeling, and implementation. In particular, we discuss how vertically fragmented data structures optimize cache performance on sequential data access. On the algorithmic side, we refine the partitioned hash-join with a new partitioning algorithm called radix-cluster, which is specifically designed to optimize memory access. The performance of this algorithm is quantified using a detailed analytical model that incorporates memory access costs in terms of a limited number of parameters, such as cache sizes and miss penalties. We also present a calibration tool that extracts such parameters automatically from any computer hardware. The accuracy of our models is proven by exhaustive experiments conducted with the Monet database system on three different hardware platforms. Finally, we investigate the effect of implementation techniques that optimize CPU resource usage. Our experiments show that large joins can be accelerated by almost an order of magnitude on modern RISC hardware when both memory and CPU resources are optimized.
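To make the idea of a partitioned hash-join with multi-pass radix partitioning concrete, here is a minimal sketch in Python. It is not the authors' implementation and not Monet code: the function names `radix_cluster` and `partitioned_hash_join`, the `(key, payload)` tuple layout, the use of the integer key's own low-order bits in place of a hash value, and the `bits_per_pass` parameter are all assumptions made for illustration. It shows only the logical structure (split both inputs on the same bits over several passes, then join matching clusters) and does not reproduce the cache behavior that the paper measures.

```python
from collections import defaultdict


def radix_cluster(tuples, bits_per_pass):
    """Split (key, payload) tuples into clusters on the key's low-order bits.

    One pass is run per entry in bits_per_pass; each pass splits every
    existing cluster into 2**bits sub-clusters. Hypothetical helper,
    assuming integer keys.
    """
    clusters = [list(tuples)]
    shift = 0
    for bits in bits_per_pass:
        fanout = 1 << bits
        mask = fanout - 1
        next_clusters = []
        for cluster in clusters:
            buckets = [[] for _ in range(fanout)]
            for key, payload in cluster:
                # Select the bucket from the bits handled by this pass.
                buckets[(key >> shift) & mask].append((key, payload))
            next_clusters.extend(buckets)
        clusters = next_clusters
        shift += bits
    return clusters


def partitioned_hash_join(left, right, bits_per_pass=(2, 2)):
    """Equi-join two lists of (key, payload) tuples cluster by cluster."""
    left_clusters = radix_cluster(left, bits_per_pass)
    right_clusters = radix_cluster(right, bits_per_pass)
    result = []
    # Clusters at the same index hold keys that agree on all radix bits.
    for left_part, right_part in zip(left_clusters, right_clusters):
        table = defaultdict(list)  # build side: small per-cluster hash table
        for key, payload in left_part:
            table[key].append(payload)
        for key, payload in right_part:  # probe side
            for left_payload in table.get(key, ()):
                result.append((key, left_payload, payload))
    return result


if __name__ == "__main__":
    left = [(1, "a"), (2, "b"), (17, "c")]
    right = [(17, "x"), (2, "y"), (3, "z"), (2, "w")]
    print(sorted(partitioned_hash_join(left, right)))
```

Splitting the partitioning into several passes keeps the number of output clusters written to in any single pass small, which is the property the abstract ties to optimized memory access; in this sketch the choice of `(2, 2)` is arbitrary.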
Pages: 709-730
Page count: 22