An efficient and scalable SPARQL query processing framework for big data using MapReduce and hybrid optimum load balancing

Cited by: 0
Authors
Kumar, V. Naveen [1 ]
Kumar, P. S. Ashok [2 ]
Affiliations
[1] Visvesvaraya Technol Univ, Don Bosco Inst Technol, Bengaluru 560074, Karnataka, India
[2] Visvesvaraya Technol Univ, ACS Coll Engn, Dept CSE, Bengaluru 560074, Karnataka, India
Keywords
RDF data storage; SPARQL querying; Hadoop; Extended vertical partitioning; Hybrid optimum load balancing; RDF data
DOI
10.1016/j.datak.2023.102239
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Growing volumes of RDF (Resource Description Framework) data call for a Hadoop platform to process queries over large datasets. In this work, SPARQL (SPARQL Protocol and RDF Query Language) queries are evaluated on Hadoop with the objective of minimizing the number of joins, using data partitioning to drive the map/reduce jobs. The proposed partitioning techniques reduce both query evaluation time and the number of cross-node joins. Extended vertical partitioning is proposed for distributed data stores; it splits predicates based on explicit information about their objects. To access the RDF data, a hybrid monarch butterfly and beetle swarm load-balancing optimization with MapReduce (Hybrid Optimum Load Balancing) is applied. The proposed SPARQL query processing is evaluated over large RDF datasets and compared with existing approaches; the results indicate the efficiency of the proposed framework, which attains an accuracy of 97%.
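The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows plain vertical partitioning (one subject/object table per predicate) rather than the extended variant, a join written in map/reduce style, and a simple greedy largest-first placement standing in for the hybrid monarch butterfly and beetle swarm optimizer. The triples and the function names `vertical_partition`, `join_on_object_subject`, and `assign_partitions` are hypothetical.

```python
from collections import defaultdict
import heapq

# Toy RDF triples (hypothetical data, not taken from the paper).
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
]


def vertical_partition(triples):
    """Group triples into one (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)


def join_on_object_subject(left, right):
    """Map/reduce-style join where left.object == right.subject.

    Map: emit each row under its join key.
    Reduce: pair the left and right rows that share a key.
    """
    groups = defaultdict(lambda: ([], []))
    for s, o in left:           # map: key left rows on their object
        groups[o][0].append(s)
    for s, o in right:          # map: key right rows on their subject
        groups[s][1].append(o)
    results = []
    for key in sorted(groups):  # reduce: cross product per key
        lefts, rights = groups[key]
        for l in lefts:
            for r in rights:
                results.append((l, key, r))
    return results


def assign_partitions(sizes, n_nodes):
    """Greedy stand-in for load balancing: place the largest
    partition first, always onto the currently least-loaded node."""
    heap = [(0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    placement = {}
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        placement[name] = node
        heapq.heappush(heap, (load + size, node))
    return placement


tables = vertical_partition(TRIPLES)
# SELECT ?x ?y ?c WHERE { ?x knows ?y . ?y worksAt ?c }
answers = join_on_object_subject(tables["knows"], tables["worksAt"])
placement = assign_partitions({p: len(rows) for p, rows in tables.items()}, 2)
```

Because the query touches only the `knows` and `worksAt` tables, no other predicate's triples need to be scanned. When both tables sit on the same node, the join also avoids a cross-node shuffle, which is the cost the abstract says the partitioning aims to reduce.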
Pages: 15