Distributed processing of regular path queries in RDF graphs

Authors
Xintong Guo
Hong Gao
Zhaonian Zou
Affiliation
[1] Harbin Institute of Technology,
Keywords
Knowledge graph; RDF/SPARQL; Regular path queries; Graph summarization; Graph partitioning;
DOI
Not available
Abstract
SPARQL 1.1 offers a type of navigational query for RDF systems called the regular path query (RPQ). An RPQ retrieves node pairs connected by paths whose edge-label sequences match a given regular expression. RPQs are difficult to evaluate efficiently because the search space can be very large, and no scalable, practical solution has existed so far. In this paper, we present Leon+, an in-memory distributed framework that addresses the RPQ problem in the context of knowledge graphs. To reduce the search space and mitigate mounting communication costs, Leon+ takes advantage of join-ahead pruning via a novel RDF summarization technique together with a path partitioning strategy. We also develop a subtle cost model to devise query plans that achieve high efficiency for complex RPQs. As no RPQ benchmark is available, we create micro-benchmarks on both synthetic and real-world datasets. We present a thorough experimental evaluation comparing our approach with state-of-the-art RDF stores. The results show that our approach is 5x faster than the competitors on single RPQs. For query workloads, it saves up to 1/2 of the time and 2/3 of the communication overhead compared with the baseline method.
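To make the RPQ definition in the abstract concrete, the following is a minimal single-machine sketch of the textbook evaluation strategy: a breadth-first search over the product of the graph and a finite automaton for the regular expression. It is not the Leon+ algorithm (it has no summarization, partitioning, or cost model), and the function name `evaluate_rpq`, the hand-built automaton, and the sample triples are all hypothetical, chosen only for illustration. The query corresponds roughly to the SPARQL 1.1 property path `knows+/worksAt`.

```python
from collections import defaultdict, deque


def evaluate_rpq(triples, nfa, start_state, accept_states):
    """Return every (source, target) pair joined by a path whose predicate
    sequence is accepted by the automaton.

    triples: iterable of (subject, predicate, object)
    nfa:     dict mapping (state, predicate) -> set of next states
    """
    # Adjacency index: node -> list of (predicate, neighbour).
    out_edges = defaultdict(list)
    nodes = set()
    for s, p, o in triples:
        out_edges[s].append((p, o))
        nodes.update((s, o))

    answers = set()
    for source in nodes:
        # BFS over (graph node, automaton state) pairs reachable from source.
        seen = {(source, start_state)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accept_states:
                answers.add((source, node))
            for predicate, neighbour in out_edges.get(node, ()):
                for next_state in nfa.get((state, predicate), ()):
                    item = (neighbour, next_state)
                    if item not in seen:
                        seen.add(item)
                        queue.append(item)
    return answers


# Hypothetical sample data (not from the paper).
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "acme"),
    ("bob", "worksAt", "initech"),
]

# Hand-built automaton for knows+/worksAt: 0 -knows-> 1 -knows-> 1 -worksAt-> 2.
NFA = {
    (0, "knows"): {1},
    (1, "knows"): {1},
    (1, "worksAt"): {2},
}

RESULT = evaluate_rpq(TRIPLES, NFA, start_state=0, accept_states={2})
```

Because the search restarts from every node and visits each (node, state) pair at most once per source, the work grows with the number of nodes times the size of the product graph. This illustrates the large search space the abstract refers to, which pruning and partitioning techniques aim to reduce.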
Pages: 993-1027
Page count: 34