GoFast: Graph-based optimization for efficient and scalable query evaluation

Cited by: 4
Authors
Zouaghi, Ishaq [1, 3]
Mesmoudi, Amin [2 ]
Galicia, Jorge [1 ]
Bellatreche, Ladjel [1 ]
Aguili, Taoufik [3 ]
Affiliations
[1] LIAS ISAE ENSMA, Chasseneuil, France
[2] Univ Poitiers, LIAS, Poitiers, France
[3] LR SysCom ENIT UTM, Tunis, Tunisia
Keywords
Optimization; RDF; SPARQL; Cardinality estimation; Cost model
DOI
10.1016/j.is.2021.101738
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The popularity of the Resource Description Framework (RDF) and SPARQL has driven the development of high-performance systems to manage data represented with this model. Early approaches adapted the well-established relational model, applying its storage, query processing, and optimization strategies. However, techniques borrowed from the relational model are not universally applicable in the RDF context. First, the schema-free nature of RDF induces intensive join overhead. Second, optimization strategies that search for the optimal join order rely on error-prone statistics that cannot capture all the correlations among triples. Graph-based approaches preserve the graph structure of RDF by representing the data directly as a graph. Their execution model relies on graph exploration operators to find subgraph matches for a query. Although they have been shown to outperform relational-based systems on complex queries, they scale poorly and their optimization techniques are entirely system-dependent. Recently, some systems such as RDF_QDAG have shown that combining graph exploration with triple clustering achieves a good compromise between performance and scalability. In this paper, we propose optimization strategies for this kind of RDF management system. First, we define novel statistics collected for clusters of triples to better capture the dependencies found in the original graph. Second, we redefine the execution plan based on these logical structures so that it can represent the RDF graph exploration process. Third, we introduce an algorithm for selecting the optimal execution plan based on a customized cost model. Finally, we propose a new approach to refine the chosen plan by pruning invalid clusters that do not participate in the construction of the final query results. All our proposals are validated experimentally using well-known RDF benchmarks. © 2021 Elsevier Ltd. All rights reserved.
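The abstract's point about error-prone statistics can be illustrated with a minimal Python sketch. It is not taken from GoFast or RDF_QDAG: the toy triples and the function names are hypothetical, and the estimator is the generic textbook join-size formula, used here only to show how an estimate built from per-predicate counts can diverge from the real result size when subjects are correlated.

```python
from collections import Counter

# Toy RDF graph as (subject, predicate, object) triples; all data is made up.
TRIPLES = [
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
    ("alice", "livesIn", "paris"),
    ("bob", "livesIn", "paris"),
    ("dave", "livesIn", "tunis"),
    ("erin", "livesIn", "tunis"),
]


def subject_counts(predicate):
    """Number of triples per subject for one predicate."""
    return Counter(s for s, p, _ in TRIPLES if p == predicate)


def true_join_cardinality(p1, p2):
    """Exact result count of the star query { ?s p1 ?o1 . ?s p2 ?o2 }."""
    left, right = subject_counts(p1), subject_counts(p2)
    # A Counter returns 0 for subjects missing on the right-hand side.
    return sum(left[s] * right[s] for s in left)


def textbook_estimate(p1, p2):
    """Generic estimate: |R| * |S| / max(distinct subjects in R, in S)."""
    left, right = subject_counts(p1), subject_counts(p2)
    n1, n2 = sum(left.values()), sum(right.values())
    return n1 * n2 / max(len(left), len(right))


if __name__ == "__main__":
    print("true:", true_join_cardinality("worksAt", "livesIn"))
    print("estimated:", textbook_estimate("worksAt", "livesIn"))
```

On this toy graph the exact count is 2 while the estimate is 3.0, because the formula only sees per-predicate totals and not which subjects actually carry both predicates.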
Pages: 18