Distributed block formation and layout for disk-based management of large-scale graphs

Authors
Abdurrahman Yaşar
Buğra Gedik
Hakan Ferhatosmanoğlu
Affiliations
[1] Georgia Institute of Technology, College of Computing
[2] Bilkent University, Department of Computer Engineering
Keywords
Graph management systems; Locality; Layout; Large scale graphs; Database management; Distributed systems
DOI
Not available
Abstract
We are witnessing an enormous growth in social networks as well as in the volume of data generated by them. An important portion of this data is in the form of graphs. In recent years, several graph processing and management systems have emerged to handle large-scale graphs. The primary goal of these systems is to run graph algorithms and queries in an efficient and scalable manner. Unlike relational data, graphs are semi-structured in nature. Thus, storing and accessing graph data on secondary storage requires new solutions that can provide locality of access for graph processing workloads. In this work, we propose a scalable block formation and layout technique for graphs, which aims at reducing the I/O cost of disk-based graph processing algorithms. To achieve this, we designed a scalable MapReduce-style method called ICBL, which can divide the graph into a series of disk blocks that contain sub-graphs with high locality. Furthermore, ICBL can order the resulting blocks on disk to further reduce non-local accesses. We experimentally evaluated ICBL to showcase its scalability, its layout quality, and the effectiveness of its automatic parameter tuning. We deployed the graph layouts generated by ICBL on Neo4j (http://www.neo4j.org/, 2015), an open-source graph database management system. Our results show that the layout generated by ICBL reduces query running times over Neo4j by more than 2× compared to the default layout.
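To make the two goals named in the abstract concrete (blocks with high locality, and a block order that reduces non-local accesses), here is a minimal sketch of how one might score a candidate layout on a toy graph. It is not ICBL and does not reproduce the paper's metrics. The function names `block_locality` and `layout_cost`, the block labels, and the distance-based cost are all assumptions made for illustration.

```python
def block_locality(edges, block_of):
    """Fraction of edges whose two endpoints are stored in the same block.

    Hypothetical measure for illustration, not taken from the paper.
    """
    if not edges:
        return 1.0
    local = sum(1 for u, v in edges if block_of[u] == block_of[v])
    return local / len(edges)


def layout_cost(edges, block_of, position):
    """Sum, over all edges, of the distance between the on-disk positions
    of the blocks holding the two endpoints (0 for same-block edges).

    Hypothetical cost for illustration, not taken from the paper.
    """
    return sum(abs(position[block_of[u]] - position[block_of[v]])
               for u, v in edges)


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# One assignment keeps each triangle together; the other scatters vertices.
good = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
poor = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}

# Assumed disk order: block A first, block B second.
position = {"A": 0, "B": 1}

print(block_locality(edges, good), layout_cost(edges, good, position))
print(block_locality(edges, poor), layout_cost(edges, poor, position))
```

Under these assumed measures, the assignment that keeps each triangle in one block leaves only the bridging edge crossing blocks, while the scattered assignment makes most traversals touch a second block.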
Pages: 23-53
Page count: 30
Source
Distributed and Parallel Databases, 2017, 35 (01)