GoFast: Graph-based optimization for efficient and scalable query evaluation

Cited by: 4
Authors
Zouaghi, Ishaq [1, 3]
Mesmoudi, Amin [2]
Galicia, Jorge [1]
Bellatreche, Ladjel [1]
Aguili, Taoufik [3]
Affiliations
[1] LIAS ISAE ENSMA, Chasseneuil, France
[2] Univ Poitiers, LIAS, Poitiers, France
[3] LR SysCom ENIT UTM, Tunis, Tunisia
Keywords
Optimization; RDF; SPARQL; Cardinality estimation; Cost model
DOI
10.1016/j.is.2021.101738
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The popularity of the Resource Description Framework (RDF) and SPARQL has driven the development of high-performance systems to manage data represented with this model. Early approaches adapted the well-established relational model, applying its storage, query processing, and optimization strategies. However, techniques borrowed from the relational model are not universally applicable in the RDF context. First, the schema-free nature of RDF induces intensive join overhead. Second, optimization strategies that try to find the optimal join order rely on error-prone statistics that cannot capture all the correlations among triples. Graph-based approaches keep the graph structure of RDF, representing the data directly as a graph. Their execution model relies on graph exploration operators to find subgraph matches for a query. Although they have been shown to outperform relational-based systems on complex queries, they scale poorly and their optimization techniques are entirely system-dependent. Recently, some systems such as RDF_QDAG have shown that combining graph exploration with triple clustering achieves a good compromise between performance and scalability. In this paper, we propose optimization strategies for this kind of RDF management system. First, we define novel statistics collected for clusters of triples to better capture the dependencies found in the original graph. Second, we redefine the notion of an execution plan in terms of these logical structures so that it can represent the RDF graph exploration process. Third, we introduce an algorithm for selecting the optimal execution plan based on a customized cost model. Finally, we propose a new approach to refine the chosen plan by pruning invalid clusters that do not participate in the construction of the final query results. All our proposals are validated experimentally using well-known RDF benchmarks. (C) 2021 Elsevier Ltd. All rights reserved.
Pages: 18