Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs

Cited by: 158
Authors
Kim, Changkyu [1 ]
Sedlar, Eric [2 ]
Chhugani, Jatin [1 ]
Kaldewey, Tim [2 ]
Nguyen, Anthony D. [1 ]
Di Blas, Andrea [2 ]
Lee, Victor W. [1 ]
Satish, Nadathur [1 ]
Dubey, Pradeep [1 ]
Affiliations
[1] Intel Corp, Throughput Comp Lab, Santa Clara, CA 95054 USA
[2] Oracle Corp, Special Projects Grp, Redwood Shores, CA 94065 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2009 / Vol. 2 / No. 02
Keywords
DOI
10.14778/1687553.1687564
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Join is an important database operation. As computer architectures evolve, the best join algorithm may change. This paper reexamines two popular join algorithms, hash join and sort-merge join, to determine whether the latest computer architecture trends shift the tide that has favored hash join for many years. For a fair comparison, we implemented the most optimized parallel version of both algorithms on the latest Intel Core i7 platform. Both implementations scale well with the number of cores in the system and take advantage of the latest processor features for performance. Our hash-based implementation achieves more than 100M tuples per second, which is 17X faster than the best reported performance on CPUs and 8X faster than that reported for GPUs. Moreover, the performance of our hash join implementation is consistent over a wide range of input data sizes, from 64K to 128M tuples, and is not affected by data skew. We compare this implementation to our highly optimized sort-based implementation, which achieves 47M to 80M tuples per second. We developed analytical models to study how both algorithms would scale with upcoming processor architecture trends. Our analysis projects that current architectural trends of wider SIMD, more cores, and smaller memory bandwidth per core imply better scalability potential for sort-merge join. Consequently, sort-merge join is likely to outperform hash join on upcoming chip multiprocessors. In summary, we offer multicore implementations of hash join and sort-merge join that consistently outperform all previously reported results. We further conclude that the tide that favors the hash join algorithm has not changed yet, but the change is just around the corner.
Pages: 1378 - 1389
Page count: 12
Related papers
32 items in total
  • [1] Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited
    Balkesen, Cagri
    Alonso, Gustavo
    Teubner, Jens
    Oezsu, M. Tamer
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 7 (01): : 85 - 96
  • [2] Optimizing Hash Join with MapReduce on Multi-Core CPUs
    Yuan, Tong
    Liu, Zhijing
    Liu, Hui
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (05): : 1316 - 1325
  • [3] Optimized merge sort on modern commodity multi-core CPUs
    Xu, Ming
    Xu, Xianbin
    Yin, MengJia
    Zheng, Fang
    Telkomnika (Telecommunication Computing Electronics and Control), 2016, 14 (01) : 309 - 318
  • [4] A Parallel SPH Implementation on Multi-Core CPUs
    Ihmsen, Markus
    Akinci, Nadir
    Becker, Markus
    Teschner, Matthias
    COMPUTER GRAPHICS FORUM, 2011, 30 (01) : 99 - 112
  • [5] Optimization Strategy of Bidirectional Join Enumeration in Multi-Core CPUS
    Chen, Yongheng
    Zuo, Wanli
    He, Fenglin
    FRONTIERS OF MANUFACTURING AND DESIGN SCIENCE, PTS 1-4, 2011, 44-47 : 383 - 387
  • [6] FPGA vs. Multi-core CPUs vs. GPUs: Hands-On Experience with a Sorting Application
    Grozea, Cristian
    Bankovic, Zorana
    Laskov, Pavel
    FACING THE MULTICORE-CHALLENGE: ASPECTS OF NEW PARADIGMS AND TECHNOLOGIES IN PARALLEL COMPUTING, 2010, 6310 : 105 - +
  • [7] Efficient Implementation of XPath Processor on Multi-Core CPUs
    Krulis, Martin
    Yaghob, Jakub
    PROCEEDINGS OF THE DATESO 2010 WORKSHOP - DATESO DATABASES, TEXTS, SPECIFICATIONS, AND OBJECTS, 2010, 567 : 60 - 71
  • [8] MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs
    Stratton, John A.
    Stone, Sam S.
    Hwu, Wen-mei W.
    LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, 2008, 5335 : 16 - +
  • [9] Parallel Implementation of External Sort and Join Operations on a Multi-core Network-Optimized System on a Chip
    Khorasani, Elahe
    Paulovicks, Brent D.
    Sheinin, Vadim
    Yeo, Hangu
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, PT I: ICA3PP 2011, 2011, 7916 : 318 - 325
  • [10] Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware
    Balkesen, Cagri
    Teubner, Jens
    Alonso, Gustavo
    Oezsu, M. Tamer
    2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 362 - 373