Efficient Processing of SPARQL Queries Over GraphFrames

Cited by: 2
Authors
Bahrami, Ramazan Ali [1]
Gulati, Jayati [1]
Abulaish, Muhammad [1]
Affiliations
[1] South Asian Univ, Dept Comp Sci, Delhi, India
Keywords
Graph mining; Linked data mining; SPARQL query processing; GraphFrames; GraphX
DOI
10.1145/3106426.3106534
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the advent of large-scale data management systems that store voluminous data, there is a need for efficient data analytics techniques for knowledge discovery at different levels of granularity. The Resource Description Framework (RDF), developed mainly for the Semantic Web, is presumably a good option for graph databases that handle large volumes of real-world data. RDF models information as triples <subject, predicate, object> and is considered a useful way to store graph data (also known as linked data), since each edge can be stored as a triple. Because a large amount of linked data exists, mostly in the form of graphs, graph mining has attracted researchers from different fields to the efficient handling (storage, indexing, retrieval, etc.) of graph data. As a result, APIs such as GraphX and GraphFrames have been developed to facilitate relational queries over graph data. Although GraphX is older than GraphFrames and some researchers have explored processing SPARQL queries over GraphX, to the best of our knowledge SPARQL query processing over GraphFrames has not yet been explored. In this paper, we present an initial study of a query-specific search-space pruning and query optimization approach for processing SPARQL queries efficiently over GraphFrames. The experimental results, in terms of low query response time, are encouraging and motivate further research in this direction.
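To make the triple-as-edge idea in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation and not the GraphFrames API. It stores RDF triples as labelled edges and evaluates a small SPARQL-style basic graph pattern by joining variable bindings. The toy data, the names `prune`, `match_pattern`, and `evaluate`, and the predicate-based pruning rule are all assumptions made for illustration; the abstract does not describe the actual pruning strategy.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical toy data: each triple is one labelled edge subject -> object.
TRIPLES: List[Triple] = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "sau"),
    ("carol", "worksAt", "sau"),
    ("bob", "likes", "tea"),
]


def is_var(term: str) -> bool:
    """Terms starting with '?' are query variables, as in SPARQL."""
    return term.startswith("?")


def prune(patterns: List[Triple], triples: List[Triple]) -> List[Triple]:
    """Assumed pruning rule: if every pattern has a constant predicate,
    keep only the edges whose predicate occurs in the query."""
    if any(is_var(p[1]) for p in patterns):
        return list(triples)
    wanted = {p[1] for p in patterns}
    return [t for t in triples if t[1] in wanted]


def match_pattern(pattern: Triple, triple: Triple,
                  binding: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def evaluate(patterns: List[Triple],
             triples: List[Triple]) -> List[Dict[str, str]]:
    """Evaluate a basic graph pattern with a nested-loop join over bindings."""
    candidates = prune(patterns, triples)
    bindings: List[Dict[str, str]] = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in candidates:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Roughly: SELECT ?x ?y WHERE { ?x knows ?y . ?y worksAt sau }
QUERY: List[Triple] = [("?x", "knows", "?y"), ("?y", "worksAt", "sau")]
RESULT = evaluate(QUERY, TRIPLES)
```

Under these assumptions, pruning discards the `likes` edge before any join takes place, so the join only scans edges that can contribute to an answer. A distributed engine would replace the nested loops with joins over edge tables, but the binding logic is the same.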
Pages: 678-685
Number of pages: 8