In-memory parallelization of join queries over large ontological hierarchies

Authors
Dimitris Bilidas
Manolis Koubarakis
Affiliation
[1] National and Kapodistrian University of Athens,
Source
Distributed and Parallel Databases, 2021, 39 (03)
Keywords
RDF; SPARQL; OWL; Join processing
Abstract
The Resource Description Framework (RDF) data model enables the construction of knowledge graphs over various domains. It uses ontologies to encode information about the domain and simple statements in the form of subject-predicate-object triples to represent data, which facilitates the interlinking and exchange of Web data. However, this simplicity comes at the cost of executing a large number of joins to obtain the desired query results. At the same time, large ontological hierarchies further complicate query answering for systems that provide complete answers with respect to such ontological axioms. In this work we present PARJ, an in-memory RDF store that takes ontological hierarchies into account during join processing with very low performance overhead, avoids expensive preprocessing and materialization of implications, and is amenable to straightforward parallelization. Specifically, we present a join implementation that can achieve any desired degree of parallelism on arbitrary join queries over RDF graphs stored in memory using compact vertical partitioning. We use an adaptive join processing approach that takes advantage of complete or even partial ordering of the RDF data, which is stored compactly to increase spatial locality and keep memory consumption low. This is coupled with an ID-to-Position vector index that is used when the ordering does not allow efficient scanning of the input relation. Finally, we experimentally show the efficiency and scalability of our proposal.
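The abstract names three ingredients: vertical partitioning, a probe that uses either the data ordering or an ID-to-Position vector, and a join that splits into independent units of work. The following is a minimal Python sketch of those ideas under stated assumptions, not PARJ's actual implementation. All function names (`vertical_partition`, `build_position_index`, `probe`, `parallel_join`), the integer-ID encoding, and the chunking strategy are hypothetical choices made for illustration, and the handling of ontological hierarchies is not covered.

```python
from bisect import bisect_left
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def vertical_partition(triples):
    """Group (s, p, o) triples into one (s, o) table per predicate, sorted by subject."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return {p: sorted(rows) for p, rows in tables.items()}


def build_position_index(table, max_id):
    """Assumed ID-to-position vector: index[s] is the first row of subject s, else -1."""
    index = [-1] * (max_id + 1)
    # Walk backwards so the smallest position for each subject is written last.
    for pos in range(len(table) - 1, -1, -1):
        index[table[pos][0]] = pos
    return index


def probe(table, key, index=None):
    """Return every object stored for subject `key`.

    Uses the position vector when one is supplied, otherwise a binary
    search that relies on the table being sorted by subject.
    """
    if index is not None:
        pos = index[key] if 0 <= key < len(index) else -1
    else:
        pos = bisect_left(table, (key,))
    if pos < 0:
        return []
    objects = []
    while pos < len(table) and table[pos][0] == key:
        objects.append(table[pos][1])
        pos += 1
    return objects


def join_chunk(left_rows, right_table, right_index):
    """Join object of the left rows with subject of the right table."""
    return [(x, y, z)
            for x, y in left_rows
            for z in probe(right_table, y, right_index)]


def parallel_join(left_table, right_table, right_index=None, workers=4):
    """Split the left table into contiguous chunks and join each chunk independently."""
    size = max(1, -(-len(left_table) // workers))  # ceiling division
    chunks = [left_table[i:i + size] for i in range(0, len(left_table), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(
            lambda chunk: join_chunk(chunk, right_table, right_index), chunks))
    return [row for part in parts for row in part]


# Toy data with integer-encoded resources: ?x worksFor ?y . ?y locatedIn ?z
triples = [
    (1, "worksFor", 10), (2, "worksFor", 11), (3, "worksFor", 10),
    (10, "locatedIn", 20), (11, "locatedIn", 21),
]
tables = vertical_partition(triples)
located_index = build_position_index(tables["locatedIn"], max_id=21)
result = parallel_join(tables["worksFor"], tables["locatedIn"],
                       located_index, workers=2)
```

Because each chunk of the left table is joined without reference to the others, the number of workers can be chosen freely. Python threads are used here only to show the structure of the work split; CPython's global interpreter lock means this sketch would not show the speedups reported for a native in-memory store.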
Pages: 545 - 582
Page count: 37
Related papers
50 in total
  • [1] In-memory parallelization of join queries over large ontological hierarchies
    Bilidas, Dimitris
    Koubarakis, Manolis
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 2021, 39 (03) : 545 - 582