Distributed Pregel-based provenance-aware regular path query processing on RDF knowledge graphs

Authors
Xin Wang
Simiao Wang
Yueqi Xin
Yajun Yang
Jianxin Li
Xiaofei Wang
Affiliations
[1] Tianjin University, College of Intelligence and Computing
[2] Tianjin Key Laboratory of Cognitive Computing and Application, School of Information Technology
[3] Deakin University
Source
World Wide Web | 2020 / Vol. 23, No. 3
Keywords
Regular path query; Provenance-aware; RDF graph; Pregel
Abstract
With the proliferation of knowledge graphs, massive RDF graphs have been published on the Web. As an essential type of query for RDF graphs, Regular Path Queries (RPQs) have attracted increasing research effort. However, existing query processing approaches mainly focus on RPQs under the standard semantics, which cannot provide the provenance of the answer sets. We propose DP2RPQ, a distributed Pregel-based approach to evaluating provenance-aware RPQs over big RDF graphs. Our method employs Glushkov automata to keep track of the matching processes of RPQs in parallel. Three optimization strategies are devised according to the cost model: vertex-computation optimization, message-communication reduction, and counting-paths alleviation. Together they dramatically reduce the intermediate results of the basic DP2RPQ algorithm and overcome the counting-paths problem to some extent. Extensive experiments on both synthetic and real-world datasets verify the proposed algorithms and show that our approach can efficiently answer provenance-aware RPQs over large RDF graphs. Furthermore, although the RPQ semantics of DP2RPQ is richer than that of RDFPath, DP2RPQ still performs far better than RDFPath.
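To make the idea in the abstract concrete, the following is a minimal single-machine sketch of automaton-guided, vertex-centric RPQ evaluation that also returns provenance (the edges lying on matching paths). It is not the DP2RPQ algorithm and includes none of its three optimizations; the function name `evaluate_rpq`, the toy triples, and the hand-written transition table are all assumptions made for illustration. The table encodes the Glushkov automaton of the expression `a b*` (state 0 is the initial state, states 1 and 2 are the positions of `a` and `b`), and each pass of the `while` loop stands in for one Pregel superstep.

```python
from collections import defaultdict

def evaluate_rpq(triples, delta, initial, finals, source):
    """Return (answers, provenance) for paths that start at `source`.

    triples : iterable of (subject, predicate, object)
    delta   : dict mapping (state, label) -> set of next states
    """
    out_edges = defaultdict(list)
    for s, p, o in triples:
        out_edges[s].append((p, o))

    visited = {(source, initial)}   # reached (vertex, state) pairs
    preds = defaultdict(set)        # (vertex, state) -> {(previous pair, triple)}
    inbox = {source: {initial}}     # messages delivered in this superstep

    while inbox:                    # one loop pass simulates one superstep
        outbox = defaultdict(set)
        for v, states in inbox.items():          # per-vertex computation
            for q in states:
                for label, w in out_edges[v]:
                    for q2 in delta.get((q, label), ()):
                        # remember how (w, q2) was reached, for provenance
                        preds[(w, q2)].add(((v, q), (v, label, w)))
                        if (w, q2) not in visited:
                            visited.add((w, q2))
                            outbox[w].add(q2)    # message: state q2 sent to w
        inbox = outbox

    accepting = [(v, q) for (v, q) in visited if q in finals]
    answers = {v for v, _ in accepting}

    # Walk backwards from accepting pairs to collect only the edges
    # that lie on at least one matching path.
    provenance, stack, seen = set(), list(accepting), set(accepting)
    while stack:
        node = stack.pop()
        for prev, triple in preds[node]:
            provenance.add(triple)
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return answers, provenance

# Toy RDF graph (hypothetical data).
TRIPLES = [
    ("v1", "a", "v2"), ("v2", "b", "v3"), ("v3", "b", "v4"),
    ("v1", "c", "v5"), ("v2", "c", "v6"),
]
# Glushkov automaton for `a b*`, written out by hand.
DELTA = {(0, "a"): {1}, (1, "b"): {2}, (2, "b"): {2}}

answers, provenance = evaluate_rpq(TRIPLES, DELTA, 0, {1, 2}, "v1")
print(sorted(answers))      # ['v2', 'v3', 'v4']
print(sorted(provenance))
```

Under the standard semantics only `answers` would be returned; the backward walk over `preds` is what yields the provenance subgraph, and the `c`-labelled edges are excluded because they lie on no matching path.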
Pages: 1465-1496
Page count: 31