HyMJ: A Hybrid Structure Aware Approach to Distributed Multi-Way Join Query

Cited by: 1
Authors
Zhu, Guanghui [1]
Wu, Xiaoqi [1]
Yin, Liangliang [1]
Wang, Haogang [1]
Gu, Rong [1]
Yuan, Chunfeng [1]
Huang, Yihua [1]
Affiliation
[1] Nanjing Univ, Collaborat Innovat Ctr Novel Software Technol & I, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
multi-way join; distributed computing; parallel query; Apache Spark
DOI
10.1109/ICDE.2019.00183
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The multi-way join query plays a fundamental role in many big data analytics scenarios. Recently, the hybrid join query has become increasingly important. However, the existing one-round and multi-round algorithms have limitations when processing hybrid queries. In this paper, we present a novel hybrid structure-aware multi-way join algorithm called HyMJ, which combines the one-round and multi-round algorithms to compute hybrid queries efficiently. First, we propose the query structure graph (QSG) to represent the internal query structure of a given join query, and the query structure decomposition tree (QSDT) to represent the structure-aware query plan. Each internal node of the QSDT denotes a subquery with a cyclic or acyclic query structure. Then, we design a graph-contraction-based algorithm to construct the QSDT from the QSG. Furthermore, to select the optimal join strategy for each subquery in the QSDT, we introduce a heuristic strategy selection model. Experimental results on Apache Spark show that HyMJ outperforms both the one-round and multi-round algorithms for hybrid multi-way join queries on real-world datasets.
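To make the distinction between cyclic and acyclic query structures concrete, here is a minimal Python sketch. It is not the paper's QSG or QSDT construction, which the abstract does not specify. It assumes a simplified structure graph in which each relation is a node and two relations are linked when they share at least one attribute, and it uses a plain union-find cycle test. The names `build_structure_graph` and `has_cycle` are hypothetical.

```python
from itertools import combinations


def build_structure_graph(relations):
    """Link every pair of relations that shares at least one attribute.

    `relations` maps a relation name to its set of attribute names.
    Assumption: this pairwise-sharing graph only stands in for the
    paper's QSG, whose exact definition is not given in the abstract.
    """
    edges = []
    for a, b in combinations(sorted(relations), 2):
        shared = relations[a] & relations[b]
        if shared:
            edges.append((a, b, frozenset(shared)))
    return edges


def has_cycle(nodes, edges):
    """Detect a cycle in an undirected graph with union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _ in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True  # a and b were already connected
        parent[root_a] = root_b
    return False


# Triangle query R(a,b), S(b,c), T(c,a): cyclic structure.
triangle = {"R": {"a", "b"}, "S": {"b", "c"}, "T": {"c", "a"}}
# Chain query R(a,b), S(b,c), T(c,d): acyclic structure.
chain = {"R": {"a", "b"}, "S": {"b", "c"}, "T": {"c", "d"}}

triangle_cyclic = has_cycle(triangle, build_structure_graph(triangle))
chain_cyclic = has_cycle(chain, build_structure_graph(chain))
```

One limitation of this simplification: three relations that all share the same single attribute (a star join) form a triangle in this graph even though such a query is usually treated as acyclic. A faithful implementation would need the paper's own definitions.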
Pages: 1726-1729
Page count: 4