Optimizing main-memory join on modern hardware

Cited by: 95
Authors
Manegold, S [1]
Boncz, P [1]
Kersten, M [1]
Affiliation
[1] CWI, NL-1098 SJ Amsterdam, Netherlands
Keywords
main-memory databases; query processing; memory access optimization; decomposed storage model; join algorithms; implementation techniques;
DOI
10.1109/TKDE.2002.1019210
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the past decade, the exponential growth in commodity CPU speed has far outpaced advances in memory latency. A second trend is that CPU performance advances are not only brought by increased clock rate, but also by increasing parallelism inside the CPU. Current database systems have not yet adapted to these trends and show poor utilization of both CPU and memory resources on current hardware. In this paper, we show how these resources can be optimized for large joins and translate these insights into guidelines for future database architectures, encompassing data structures, algorithms, cost modeling, and implementation. In particular, we discuss how vertically fragmented data structures optimize cache performance on sequential data access. On the algorithmic side, we refine the partitioned hash-join with a new partitioning algorithm called radix-cluster, which is specifically designed to optimize memory access. The performance of this algorithm is quantified using a detailed analytical model that incorporates memory access costs in terms of a limited number of parameters, such as cache sizes and miss penalties. We also present a calibration tool that extracts such parameters automatically from any computer hardware. The accuracy of our models is proven by exhaustive experiments conducted with the Monet database system on three different hardware platforms. Finally, we investigate the effect of implementation techniques that optimize CPU resource usage. Our experiments show that large joins can be accelerated by almost an order of magnitude on modern RISC hardware when both memory and CPU resources are optimized.
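The abstract names a partitioned hash-join refined with radix-cluster partitioning but gives no details. The following is a minimal Python sketch of the general idea only, not the authors' implementation: tuples are clustered on the lowest bits of an integer key in several passes, so that each pass writes to a limited number of output buckets, and the join then runs cluster by cluster. The function names, the `(key, payload)` tuple layout, the use of the key itself as its hash value, and the default bit counts are all assumptions made for illustration.

```python
def radix_cluster(tuples, total_bits, bits_per_pass):
    """Cluster (key, payload) tuples on the lowest `total_bits` bits of the key.

    Hypothetical helper: the key is used directly as its own hash value.
    Each pass splits every existing cluster on `bits_per_pass` further bits,
    starting with the most significant bits of the `total_bits` window.
    """
    clusters = [list(tuples)]
    remaining = total_bits
    while remaining > 0:
        bits = min(bits_per_pass, remaining)
        remaining -= bits
        mask = (1 << bits) - 1
        next_clusters = []
        for cluster in clusters:
            buckets = [[] for _ in range(1 << bits)]
            for key, payload in cluster:
                buckets[(key >> remaining) & mask].append((key, payload))
            next_clusters.extend(buckets)
        clusters = next_clusters
    # Cluster i now holds exactly the tuples whose low `total_bits` bits equal i.
    return clusters


def partitioned_hash_join(left, right, total_bits=4, bits_per_pass=2):
    """Equi-join two lists of (key, payload) tuples, one cluster pair at a time."""
    left_clusters = radix_cluster(left, total_bits, bits_per_pass)
    right_clusters = radix_cluster(right, total_bits, bits_per_pass)
    result = []
    for lc, rc in zip(left_clusters, right_clusters):
        # Build a small hash table on the left cluster, then probe with the right.
        table = {}
        for key, payload in lc:
            table.setdefault(key, []).append(payload)
        for key, payload in rc:
            for left_payload in table.get(key, ()):
                result.append((key, left_payload, payload))
    return result


left = [(1, "a"), (2, "b"), (17, "c"), (3, "d")]
right = [(17, "x"), (2, "y"), (2, "z"), (5, "w")]
joined = sorted(partitioned_hash_join(left, right))
```

In this sketch, matching tuples always land in clusters with the same index, so each per-cluster hash table stays small. Choosing `total_bits` and `bits_per_pass` to fit a particular cache hierarchy is the subject of the paper's cost model and is not attempted here.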
Pages: 709-730
Page count: 22
Related papers
  • [11] Energy Efficiency in Main-Memory Databases
    Stefan Noll
    Henning Funke
    Jens Teubner
    [J]. Datenbank-Spektrum, 2017, 17 (3) : 223 - 232
  • [12] A robust main-memory compression scheme
    Ekman, M
    Stenstrom, P
    [J]. 32ND INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, PROCEEDINGS, 2005, : 74 - 85
  • [13] Concurrency control in a main-memory DBMS
    Kim, SW
    [J]. COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2004, 19 (04): : 263 - 272
  • [14] DimmWitted: A Study of Main-Memory Statistical Analytics
    Zhang, Ce
    Re, Christopher
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 7 (12): : 1283 - 1294
  • [15] The Architecture of the Dalí Main-Memory Storage Manager
    Philip Bohannon
    Daniel Lieuwen
    Rajeev Rastogi
    Avi Silberschatz
    S. Seshadri
    S. Sudarshan
    [J]. Multimedia Tools and Applications, 1997, 4 : 115 - 151
  • [16] CHOOSING AN OPTIMUM VERSION OF MAIN-MEMORY ALLOCATION
    SHVIDKAYA, GD
    [J]. AUTOMATION AND REMOTE CONTROL, 1989, 50 (11) : 1595 - 1599
  • [17] A PRACTICAL ARCHITECTURE OF DISTRIBUTED REAL-TIME MAIN-MEMORY DATABASES FOR MODERN SCADA SYSTEMS
    Dai Hong-Bin
    Jin Shu
    [J]. 2011 3RD INTERNATIONAL CONFERENCE ON COMPUTER TECHNOLOGY AND DEVELOPMENT (ICCTD 2011), VOL 1, 2012, : 27 - 31
  • [18] Adaptive Data Skipping in Main-Memory Systems
    Qin, Wilson
    Idreos, Stratos
    [J]. SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 2255 - 2256
  • [19] The architecture of the Dali main-memory storage manager
    Bohannon, P
    Lieuwen, D
    Rastogi, R
    Silberschatz, A
    Seshadri, S
    Sudarshan, S
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 1997, 4 (02) : 115 - 151
  • [20] An Interval Join Optimized for Modern Hardware
    Piatov, Danila
    Helmer, Sven
    Dignos, Anton
    [J]. 2016 32ND IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2016, : 1098 - 1109