Hybrid Pulling/Pushing for I/O-Efficient Distributed and Iterative Graph Computing

Cited by: 23
Authors
Wang, Zhigang [1]
Gu, Yu [1]
Bao, Yubin [1]
Yu, Ge [1]
Yu, Jeffrey Xu [2]
Affiliations
[1] Northeastern Univ, Shenyang, Liaoning, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
I/O-Efficient; Distributed Graph Computing; Push; Pull; FRAMEWORK;
DOI
10.1145/2882903.2882938
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Billion-node graphs are rapidly growing in size in many applications such as online social networks. Most graph algorithms generate a large number of messages during iterative computations. Vertex-centric distributed systems usually store graph data and message data on disk to improve scalability. Currently, these distributed systems with disk-resident data take a push-based approach to handle messages. This works well if few messages reside on disk. Otherwise, it is I/O-inefficient due to expensive random writes. By contrast, the existing memory-resident pull-based approach individually pulls messages for each vertex on demand. Although it can be used to avoid disk operations regarding messages, expensive I/O costs are incurred by random and frequent access to vertices. This paper proposes a hybrid solution to support switching between push and pull adaptively, to obtain optimal performance for distributed systems with disk-resident data in different scenarios. We first employ a new block-centric technique (b-pull) to improve the I/O-performance of pulling messages, although the iterative computation is vertex-centric. I/O costs of data accesses are shifted from the receiver side where messages are written/read by push to the sender side where graph data are read by b-pull. Graph data are organized by clustering vertices and edges to achieve high I/O efficiency in b-pull. Second, we design a seamless switching mechanism and a prominent performance prediction method to guarantee efficiency when switching between push and b-pull. We conduct extensive performance studies to confirm the effectiveness of our proposals over existing up-to-date solutions using a broad spectrum of real-world graphs.
Pages: 479-494
Number of pages: 16