Estimating the Impact of Communication Schemes for Distributed Graph Processing

Authors
Ye, Tian [1]
Kuppannagari, Sanmukh R. [1]
De Rose, Cesar A. F.
Wijeratne, Sasindu [1]
Kannan, Rajgopal [1]
Prasanna, Viktor K. [1]
Affiliations
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90007 USA
Keywords
Distributed Graph Processing; Performance Estimation; Communication Schemes; Cluster Computing
DOI
10.1109/ISPDC55340.2022.00016
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Extreme-scale graph analytics is imperative for several real-world Big Data applications whose underlying graph structure contains millions or billions of vertices and edges. Since such huge graphs cannot fit into the memory of a single computer, distributed processing of the graph is required. Several frameworks have been developed for performing graph processing on distributed systems. These frameworks focus primarily on choosing the right computation model and partitioning scheme, under the assumption that such design choices will automatically reduce communication overheads. For any computational model and partitioning scheme, communication schemes (the data to be communicated and the virtual interconnection network among the nodes) have a significant impact on performance. To analyze this impact, in this work, we identify widely used communication schemes and estimate their performance. Analyzing the trade-offs between the number of compute nodes and the communication costs of various schemes on a distributed platform by brute-force experimentation can be prohibitively expensive. Thus, our performance estimation models provide an economical way to perform the analyses given the partitions and the communication scheme as input. We validate our model on a local HPC cluster as well as the cloud-hosted NSF Chameleon cluster. Using our estimates as well as the actual measurements, we compare the communication schemes and provide conditions under which one scheme should be preferred over the others.
Pages: 49-56
Page count: 8