DATA-FLOW QUERY EXECUTION IN A PARALLEL MAIN-MEMORY ENVIRONMENT

Cited by: 38
Authors
WILSCHUT, AN
APERS, PMG
Affiliations
[1] University of Twente, 7500 AE Enschede
[2] University of Twente, 7500 AE Enschede
Keywords
PARALLEL QUERY PROCESSING; MULTI-JOIN QUERIES; SIMULATION; ANALYTICAL MODELING
DOI
10.1007/BF01277522
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, the performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results of this study are a step into the direction of the design of a query optimization strategy that is fit for parallel execution of complex queries. Among others, synchronization issues are identified to limit the performance gain from parallelism. A new hash-join algorithm is introduced that has fewer synchronization constraints than the known hash-join algorithms. Also, the behavior of individual join operations in a join-tree is studied in a simulation experiment. The results show that the introduced Pipelining hash-join algorithm yields a better performance for multi-join queries. The format of the optimal join-tree appears to depend on the size of the operands of the join: A multi-join between small operands performs best with a bushy schedule; larger operands are better off with a linear schedule. The results from the simulation study are confirmed with an analytic model for dataflow query execution.
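The abstract does not spell out how the Pipelining hash-join works, so the following is only a minimal sketch of the general idea as it is commonly described: a hash table is kept for each operand, and every arriving tuple is first inserted into its own table and then probed against the other operand's table, so results can be emitted before either input is complete. The function name, the key-extractor parameters, and the round-robin interleaving used to imitate concurrent arrival are assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict
from itertools import zip_longest


def pipelining_hash_join(left, right, left_key, right_key):
    """Yield (left_row, right_row) pairs with equal keys as tuples arrive.

    Hypothetical sketch: both inputs are consumed in alternation to imitate
    two operand streams arriving concurrently.
    """
    left_table = defaultdict(list)   # key -> left rows seen so far
    right_table = defaultdict(list)  # key -> right rows seen so far
    missing = object()               # sentinel for an exhausted input

    for l_row, r_row in zip_longest(left, right, fillvalue=missing):
        if l_row is not missing:
            key = left_key(l_row)
            left_table[key].append(l_row)           # insert into own table
            for match in right_table.get(key, ()):  # probe the other table
                yield (l_row, match)
        if r_row is not missing:
            key = right_key(r_row)
            right_table[key].append(r_row)
            for match in left_table.get(key, ()):
                yield (match, r_row)


if __name__ == "__main__":
    orders = [(1, "a"), (2, "b"), (3, "c")]
    items = [(2, "x"), (1, "y"), (2, "z")]
    for pair in pipelining_hash_join(orders, items, lambda r: r[0], lambda r: r[0]):
        print(pair)
```

Each matching pair is produced exactly once, by whichever of its two tuples arrives later. Neither operand has to be fully hashed before output starts, which is the kind of reduced synchronization constraint the abstract refers to; the price in this sketch is that both hash tables are held in memory.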
Pages: 103-128
Page count: 26
Related papers
50 in total
  • [1] DATA-FLOW BASED EXECUTION MECHANISMS OF PARALLEL AND CONCURRENT PROLOG
    ITO, N
    SHIMIZU, H
    KISHI, M
    KUNO, E
    ROKUSAWA, K
    [J]. NEW GENERATION COMPUTING, 1985, 3 (01) : 15 - 41
  • [2] MQJoin: Efficient Shared Execution of Main-Memory Joins
    Makreshanski, Darko
    Giannikis, Georgios
    Alonso, Gustavo
    Kossmann, Donald
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2016, 9 (06) : 480 - 491
  • [3] Optimization of data flow execution in a parallel environment
    Georgia Kougka
    Anastasios Gounaris
    [J]. Distributed and Parallel Databases, 2019, 37 : 385 - 410
  • [4] Optimization of data flow execution in a parallel environment
    Kougka, Georgia
    Gounaris, Anastasios
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 2019, 37 (03) : 385 - 410
  • [5] Modeling Data Flow Execution in a Parallel Environment
    Kougka, Georgia
    Gounaris, Anastasios
    Leser, Ulf
    [J]. BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2017, 2017, 10440 : 183 - 196
  • [6] Fast OLAP Query Execution in Main Memory on Large Data in a Cluster
    Weidner, Martin
    Dees, Jonathan
    Sanders, Peter
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013,
  • [7] Memory Bounds for the Distributed Execution of a Hierarchical Synchronous Data-Flow Graph
    Desnos, Karol
    Pelcat, Maxime
    Nezan, Jean-Francois
    Aridhi, Slaheddine
    [J]. 2012 INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS (SAMOS): ARCHITECTURES, MODELING AND SIMULATION, 2012, : 160 - 167
  • [8] BLOCK: Efficient Execution of Spatial Range Queries in Main-Memory
    Olma, Matthaios
    Tauheed, Farhan
    Heinis, Thomas
    Ailamaki, Anastasia
    [J]. SSDBM 2017: 29TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2017,
  • [9] Adaptive Data Skipping in Main-Memory Systems
    Qin, Wilson
    Idreos, Stratos
    [J]. SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 2255 - 2256
  • [10] Supporting design patterns in a visual parallel data-flow programming environment
    Toyoda, M
    Shizuki, B
    Takahashi, S
    Matsuoka, S
    Shibayama, E
    [J]. 1997 IEEE SYMPOSIUM ON VISUAL LANGUAGES, PROCEEDINGS, 1997, : 76 - 83