Distributed RDF Archives Querying with Spark

Cited by: 0
Authors
Bahri, Afef [1]
Laajimi, Meriem [2]
Ayadi, Nadia Yacoubi [3]
Affiliations
[1] Univ Sfax, MIRACL Lab, Sfax, Tunisia
[2] High Inst Management Tunis, Tunis, Tunisia
[3] Univ Manouba, ENSI, RIADI Res Lab, Manouba 2010, Tunisia
Keywords
RDF archives; Distributed systems; Versioning queries; SPARQL; SPARK; SPARK SQL
DOI
10.1007/978-3-319-98192-5_59
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The prevalence of open data and the growth of information published on the web have produced a large volume of available RDF data. As published datasets evolve, users may need access not only to the current version of a dataset but also to previous ones, and may want to track how the data changes over time. To this end, single-machine RDF archiving systems and benchmarks have been proposed, but they do not scale well when querying large RDF archives. Distributed data management systems offer a promising direction, providing scalability and parallel processing for large volumes of RDF data. In this paper, we study and compare commonly used RDF archiving techniques and querying strategies on the distributed computing platform Spark. We propose a formal mapping of versioning queries defined in SPARQL to Spark SQL. We run a series of experiments on these queries to study the effects of partitioning and distributing RDF archives.
Pages: 451-465
Number of pages: 15