Parallelizing Query Optimization on Shared-Nothing Architectures

Authors
Trummer, Immanuel [1]
Koch, Christoph [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2016 / Volume 9 / Issue 09
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Data processing systems offer an ever increasing degree of parallelism on the levels of cores, CPUs, and processing nodes. Query optimization must exploit high degrees of parallelism in order not to gradually become the bottleneck of query evaluation. We show how to parallelize query optimization at a massive scale. We present algorithms for parallel query optimization in left-deep and bushy plan spaces. At optimization start, we divide the plan space for a given query into partitions of equal size that are explored in parallel by worker nodes. At the end of optimization, each worker returns the optimal plan in its partition to the master which determines the globally optimal plan from the partition-optimal plans. No synchronization or data exchange is required during the actual optimization phase. The amount of data sent over the network, at the start and at the end of optimization, as well as the complexity of serial steps within our algorithms increase only linearly in the number of workers and in the query size. The time and space complexity of optimization within one partition decreases uniformly in the number of workers. We parallelize single- and multi-objective query optimization over a cluster with 100 nodes in our experiments, using more than 250 concurrent worker threads (Spark executors). Despite high network latency and task assignment overheads, parallelization yields speedups of up to one order of magnitude for large queries whose optimization takes minutes on a single node.
Pages: 660-671
Number of pages: 12
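
The partition-then-merge scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the partitioning rule (one partition per first table of a left-deep join order, which gives partitions of equal size), the toy cost model, the `CARD` statistics, the `SELECTIVITY` constant, and all function names are assumptions made for illustration. Each worker searches its partition exhaustively rather than with dynamic programming, and a local thread pool stands in for the worker nodes of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations

# Hypothetical toy statistics: table cardinalities and one uniform join selectivity.
CARD = {"A": 1000, "B": 50, "C": 200, "D": 10}
SELECTIVITY = 0.01


def plan_cost(order):
    """Toy cost model: sum of intermediate result sizes of a left-deep join order."""
    size = CARD[order[0]]
    cost = 0.0
    for table in order[1:]:
        size = size * CARD[table] * SELECTIVITY
        cost += size
    return cost


def best_in_partition(first_table, tables):
    """Worker: search every left-deep order that starts with first_table."""
    rest = [t for t in tables if t != first_table]
    candidates = ((first_table,) + p for p in permutations(rest))
    best = min(candidates, key=plan_cost)
    return plan_cost(best), best


def optimize(tables, max_workers=4):
    """Master: assign one partition per first table, then merge the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Workers exchange no data while they search their partitions.
        results = list(pool.map(lambda t: best_in_partition(t, tables), tables))
    # The global optimum is the cheapest of the partition-optimal plans.
    return min(results)


if __name__ == "__main__":
    cost, plan = optimize(sorted(CARD))
    print(cost, plan)
```

Workers communicate only when partitions are assigned and when the partition-optimal plans are returned, which mirrors the property stated in the abstract. Partitioning by first table yields exactly as many partitions as there are tables, so this sketch cannot use more workers than that; Python threads also do not run CPU-bound work in parallel, so the pool here shows the control flow rather than a real speedup.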