Locality-aware Partitioning in Parallel Database Systems

Cited by: 33
Authors
Zamanian, Erfan [1]
Binnig, Carsten [1, 2]
Salama, Abdallah [2]
Affiliations
[1] Brown Univ, Providence, RI 02912 USA
[2] Baden Wuerttemberg Cooperat State Univ, Mannheim, Germany
DOI
10.1145/2723372.2723718
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Parallel database systems horizontally partition large amounts of structured data in order to provide parallel data processing capabilities for analytical workloads in shared-nothing clusters. One major challenge when horizontally partitioning large amounts of data is to reduce the network costs for a given workload and database schema. A common technique for reducing network costs in parallel database systems is to co-partition tables on their join key in order to avoid expensive remote join operations. However, existing partitioning schemes are limited in that respect, since only subsets of tables in complex schemata that share the same join key can be co-partitioned unless tables are fully replicated. In this paper, we present a novel partitioning scheme called predicate-based reference partition (or PREF for short) that co-partitions sets of tables based on given join predicates. Moreover, based on PREF, we present two automatic partitioning design algorithms to maximize data locality. One algorithm needs only the schema and data, whereas the other additionally takes the workload as input. In our experiments, we show that our automated design algorithms can partition database schemata of different complexity and thus help to effectively reduce the runtime of queries under a given workload when compared to existing partitioning approaches.
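The idea of co-partitioning by a join predicate rather than by a shared key can be illustrated with a small sketch. It is not the authors' implementation: the function names (`hash_partition`, `pref_partition`, `local_join`), the toy tables, and the round-robin placement of rows without a join partner are all assumptions made for illustration. A seed table is hash-partitioned, and a second table is then placed wherever its join partners live, duplicating rows when partners sit in several partitions.

```python
from collections import defaultdict

def hash_partition(rows, key, n):
    """Seed table: assign each row to partition (key value mod n)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)
    return parts

def pref_partition(rows, key, ref_parts, ref_key):
    """Place each row in every partition that holds a join partner.

    Rows may be duplicated across partitions. Rows without any partner
    are assigned round-robin (an assumption of this sketch).
    """
    n = len(ref_parts)
    # Map each join-key value to the partitions of the referenced table
    # that contain it.
    locations = defaultdict(set)
    for p, part in enumerate(ref_parts):
        for ref_row in part:
            locations[ref_row[ref_key]].add(p)
    parts = [[] for _ in range(n)]
    orphans = 0
    for row in rows:
        targets = locations.get(row[key])
        if not targets:
            targets = {orphans % n}
            orphans += 1
        for p in sorted(targets):
            parts[p].append(row)
    return parts

def local_join(left_parts, left_key, right_parts, right_key):
    """Join partition i of the left table only with partition i of the right."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        for l in lp:
            for r in rp:
                if l[left_key] == r[right_key]:
                    out.append((l, r))
    return out

# Hypothetical toy data: orders reference customers via cust_id.
orders = [
    {"order_id": 1, "cust_id": 10},
    {"order_id": 2, "cust_id": 10},
    {"order_id": 3, "cust_id": 20},
]
customers = [{"cust_id": 10}, {"cust_id": 20}, {"cust_id": 30}]

order_parts = hash_partition(orders, "order_id", 2)
# Customers are partitioned by the predicate customers.cust_id = orders.cust_id,
# even though orders were partitioned on a different column.
cust_parts = pref_partition(customers, "cust_id", order_parts, "cust_id")
joined = local_join(order_parts, "cust_id", cust_parts, "cust_id")
```

In this toy run, customer 10 ends up in both partitions because its orders are spread across both, so every order finds its customer locally and the partition-wise join returns the same pairs as a global join. The price is partial duplication of the referencing table, which is still less than full replication.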
Pages: 17-30
Number of pages: 14