SigMR: MapReduce-based SPARQL query processing by signature encoding and multi-way join

Cited by: 0
Authors
Jinhyun Ahn
Dong-Hyuk Im
Hong-Gee Kim
Affiliations
[1] Seoul National University, Biomedical Knowledge Engineering Laboratory, Dental Research Institute
[2] Hoseo University, Department of Computer and Information Engineering
[3] Seoul National University, Institute of Human
Source
Keywords
Hadoop; MapReduce; Multi-way join; Signature encoding; SigMR; SPARQL
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Linked Data makes large numbers of Resource Description Framework (RDF) triples available, and their volume can grow exponentially, which makes SPARQL query processing on a single machine infeasible. To address this scalability issue, MapReduce-based SPARQL engines have been proposed, but these methods are limited in how they evaluate joins. The two-way join-based approach evaluates joins via a sequence of binary multiplications that requires multiple MapReduce jobs, with costly disk accesses between jobs. The multi-way join-based approach combines multiple two-way join operations, so that the joins can be evaluated simultaneously in one MapReduce job. However, the size of the data for that job might increase exponentially when a complex query is given. In this study, we propose SigMR, a pruning method for multi-way join-based SPARQL query processing in MapReduce. In the proposed approach, a SPARQL query can be evaluated in a single MapReduce job, where the size of the data is reduced dramatically by pruning based on our signature encoding technique, thereby overcoming the weaknesses of the previous approaches. In experiments, query processing time was lower with our approach than with existing MapReduce-based methods.
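
The idea in the abstract, evaluating all joins in one map/reduce pass while pruning the input by a signature, can be illustrated with a small in-memory sketch. This is not the SigMR algorithm: the abstract does not describe the actual signature encoding, so the per-subject predicate bitmask below, the sample triples, and the names `build_signatures` and `multiway_join` are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "type", "Person"),
    ("alice", "worksAt", "SNU"),
    ("bob", "type", "Person"),
    ("carol", "worksAt", "Hoseo"),
]

# Star query on ?x: ?x type Person . ?x worksAt ?y
# Each pattern is (predicate, object), with None standing for a variable.
PATTERNS = [("type", "Person"), ("worksAt", None)]


def build_signatures(triples):
    """Assumed encoding: one bit per predicate, OR-ed together per subject."""
    pred_bit = {}
    sig = defaultdict(int)
    for s, p, _ in triples:
        bit = pred_bit.setdefault(p, 1 << len(pred_bit))
        sig[s] |= bit
    return pred_bit, sig


def multiway_join(triples, patterns, prune=True):
    """Evaluate all patterns in one simulated map/shuffle/reduce pass."""
    pred_bit, sig = build_signatures(triples)
    query_sig = 0
    for p, _ in patterns:
        query_sig |= pred_bit.get(p, 0)

    # Map + shuffle: group matching triples by the join key (the subject).
    shuffled = defaultdict(lambda: defaultdict(list))
    emitted = 0
    for s, p, o in triples:
        # Prune subjects whose signature cannot satisfy every pattern.
        if prune and (sig[s] & query_sig) != query_sig:
            continue
        for i, (qp, qo) in enumerate(patterns):
            if p == qp and (qo is None or o == qo):
                shuffled[s][i].append(o)
                emitted += 1

    # Reduce: a subject joins only if every pattern matched at least once.
    results = []
    for s, groups in shuffled.items():
        if len(groups) == len(patterns):
            for combo in product(*(groups[i] for i in range(len(patterns)))):
                results.append((s, *combo))
    return results, emitted


if __name__ == "__main__":
    print(multiway_join(TRIPLES, PATTERNS, prune=True))
    print(multiway_join(TRIPLES, PATTERNS, prune=False))
```

With and without pruning the join result is the same; the second return value counts the records that would be shuffled to the reducers, which is the quantity the pruning step is meant to reduce.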
Pages: 3695-3725
Page count: 30
Related papers
22 in total
  • [1] SigMR: MapReduce-based SPARQL query processing by signature encoding and multi-way join
    Ahn, Jinhyun
    Im, Dong-Hyuk
    Kim, Hong-Gee
    [J]. JOURNAL OF SUPERCOMPUTING, 2015, 71 (10): 3695 - 3725
  • [2] Efficient Multi-way Theta-Join Processing Using MapReduce
    Zhang, Xiaofei
    Chen, Lei
    Wang, Min
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (11): 1184 - 1195
  • [3] An algorithm for multi-way distance join query
    Liang, Yin
    Zhang, Hong
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-6, PROCEEDINGS, 2006: 412 - +
  • [4] Two MRJs for Multi-way Theta-Join in MapReduce
    Yan, Ke
    Zhu, Hong
    [J]. INTERNET AND DISTRIBUTED COMPUTING SYSTEMS, IDCS 2013, 2013, 8223 : 321 - 332
  • [5] A Survey of Traditional and MapReduce-Based Spatial Query Processing Approaches
    Singh, Hari
    Bawa, Seema
    [J]. SIGMOD RECORD, 2017, 46 (02) : 18 - 29
  • [6] Query processing of multi-way stream window joins
    Moustafa A. Hammad
    Walid G. Aref
    Ahmed K. Elmagarmid
    [J]. The VLDB Journal, 2008, 17 : 469 - 488
  • [7] Query processing of multi-way stream window joins
    Hammad, Moustafa A.
    Aref, Walid G.
    Elmagarmid, Ahmed K.
    [J]. VLDB JOURNAL, 2008, 17 (03): 469 - 488
  • [8] A Scalable Sparse Matrix-Based Join for SPARQL Query Processing
    Zhang, Xiaowang
    Zhang, Mingyue
    Peng, Peng
    Song, Jiaming
    Feng, Zhiyong
    Zou, Lei
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2019, 11448 : 510 - 514
  • [9] HyMJ: A Hybrid Structure Aware Approach to Distributed Multi-Way Join Query
    Zhu, Guanghui
    Wu, Xiaoqi
    Yin, Liangliang
    Wang, Haogang
    Gu, Rong
    Yuan, Chunfeng
    Huang, Yihua
    [J]. 2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019), 2019: 1726 - 1729
  • [10] Generalized bitmap indexes for multi-way equijoin query processing
    Scott, K
    Perrizo, W
    Zou, QH
    [J]. PARALLEL AND DISTRIBUTED COMPUTING SYSTEMS, 2000: 542 - 547