A Distributed Query Method for RDF Data on Spark

Authors
Guo, Minru [1]
Wang, Jingbin [1]
Affiliation
[1] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350108, Peoples R China
Keywords
Distributed; Spark; RDF; Index; Query;
DOI
10.1007/978-981-10-0457-5_11
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
With the coming deluge of semantic data, the rapid growth of RDF data poses significant challenges for query processing. To address the low efficiency of RDF queries, a new distributed RDF query algorithm on the Spark platform, RQCCP (RDF data Query combined with Classes Correlations with Property), is proposed. It splits and stores RDF data by the class of the subject, the predicate, and the class of the object, and at the same time builds an index file of class correlations with properties. The index is used to narrow the query input by filtering out irrelevant triples in advance, and intermediate query results are cached in memory as resilient distributed datasets to reduce disk and network I/O. Experiments on large-scale RDF datasets show that RQCCP achieves high query performance.
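The partitioning and index idea in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: plain Python dictionaries stand in for the partition files and for Spark's resilient distributed datasets, every resource is assumed to have at most one `rdf:type`, and the names `build_partitions`, `scan`, and the sample triples are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def build_partitions(triples):
    """Group triples by (subject class, predicate, object class).

    Also builds an index mapping each predicate to the partition keys
    in which it occurs (a stand-in for the class/property index file).
    """
    # First pass: record the class of each typed resource.
    # Assumption: at most one rdf:type per resource.
    class_of = {s: o for s, p, o in triples if p == RDF_TYPE}

    partitions = defaultdict(list)
    index = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue  # type triples are only used to derive classes here
        # Untyped resources and literals get the class None.
        key = (class_of.get(s), p, class_of.get(o))
        partitions[key].append((s, o))
        index[p].add(key)
    return partitions, index


def scan(partitions, index, predicate, subject_class=None, object_class=None):
    """Read only the partitions that can match one triple pattern."""
    rows = []
    for key in index.get(predicate, ()):
        s_cls, _, o_cls = key
        if subject_class is not None and s_cls != subject_class:
            continue  # partition filtered out before any triple is read
        if object_class is not None and o_cls != object_class:
            continue
        rows.extend(partitions[key])
    return rows


# Hypothetical sample data.
triples = [
    ("alice", RDF_TYPE, "Student"),
    ("bob", RDF_TYPE, "Professor"),
    ("db101", RDF_TYPE, "Course"),
    ("alice", "takesCourse", "db101"),
    ("bob", "teacherOf", "db101"),
    ("alice", "advisor", "bob"),
]

partitions, index = build_partitions(triples)
students_in_courses = scan(partitions, index, "takesCourse", subject_class="Student")
```

In a Spark setting, each partition would presumably be a separate file loaded on demand, and the result of `scan` would be kept in memory (for example via an RDD's `cache()`) so that later joins do not re-read it from disk.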
Pages: 102-115
Page count: 14
Related papers
50 in total
  • [41] A Development of RDF Data Transfer and Query on Hadoop Framework
    Kawises, Jutamard
    Vatanawood, Wiwat
    [J]. 2016 IEEE/ACIS 15TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2016, : 217 - 220
  • [42] Spatiotemporal RDF Data Query Based on Subgraph Matching
    Meng, Xiangfu
    Zhu, Lin
    Li, Qing
    Zhang, Xiaoyan
    [J]. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2021, 10 (12)
  • [43] A Survey of Distributed RDF Data Management
    Zou L.
    Peng P.
    [J]. 2017, Science Press (54): : 1213 - 1224
  • [44] Linked Data Partitioning for RDF Processing on Apache Spark
    Atashkar, Amir Hossein
    Ghadiri, Nasser
    Joodaki, Mehdi
    [J]. 2017 3RD INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR), 2017, : 73 - 77
  • [45] Semantic similarity method for keyword query system on RDF
    Bae, Minho
    Kang, Sanggil
    Oh, Sangyoon
    [J]. NEUROCOMPUTING, 2014, 146 : 264 - 275
  • [46] Efficient Distributed Range Query Processing in Apache Spark
    Papadopoulos, Apostolos N.
    Sioutas, Spyros
    Zacharatos, Nikolaos
    Zaroliagis, Christos
    [J]. 2019 19TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2019, : 569 - 575
  • [47] DREAM: Distributed RDF Engine with Adaptive Query Planner and Minimal Communication
    Hammoud, Mohammad
    Rabbou, Dania Abed
    Nouri, Reza
    Beheshti, Seyed-Mehdi-Reza
    Sakr, Sherif
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2015, 8 (06): : 654 - 665
  • [48] Adaptive Distributed RDF Graph Fragmentation and Allocation based on Query Workload
    Peng, Peng
    Zou, Lei
    Chen, Lei
    Zhao, Dongyan
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (04) : 670 - 685
  • [49] Query-Driven Verification of Data Integration in the RDF Data Model
    Stupnikov S.A.
    [J]. Lobachevskii Journal of Mathematics, 2023, 44 (1) : 205 - 218
  • [50] RDF Data Query and Management Method based on HBase and Structure Index in Railway Sensor Application
    Yang, Menglun
    Zhang, Baopeng
    Li, Yidong
    [J]. 2013 INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES (PDCAT), 2013, : 36 - 43