Bipartite-Oriented Distributed Graph Partitioning for Big Learning

Cited by: 8
Authors
Chen, Rong [1]
Shi, Jia-Xin [1]
Chen, Hai-Bo [1]
Zang, Bin-Yu [1]
Affiliation
[1] Shanghai Jiao Tong University, Shanghai Key Laboratory of Scalable Computing and Systems, Institute of Parallel and Distributed Systems, Shanghai 200240, People's Republic of China
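Funding
National Research Foundation, Singapore; National Natural Science Foundation of China; Shanghai Rising-Star Program (Science and Technology);
Keywords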
bipartite graph; graph partitioning; graph-parallel system;
DOI
10.1007/s11390-015-1501-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Many machine learning and data mining (MLDM) problems, such as recommendation, topic modeling, and medical diagnosis, can be modeled as computations on bipartite graphs. However, most distributed graph-parallel systems are oblivious to the unique characteristics of such graphs, and existing online graph partitioning algorithms usually cause excessive vertex replication as well as significant pressure on network communication. This article identifies the challenges and opportunities of partitioning bipartite graphs for distributed MLDM processing and proposes BiGraph, a set of bipartite-oriented graph partitioning algorithms. BiGraph leverages observations such as the skewed distribution of vertices, the differentiated computation load, and the imbalanced data sizes between the two subsets of vertices to derive a set of optimal graph partitioning algorithms that result in minimal vertex replication and network communication. BiGraph has been implemented on PowerGraph and is shown to speed up four typical MLDM algorithms by 1.16X to 17.75X, by reducing vertex replication by up to 80% and network traffic by up to 96%.
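The abstract does not spell out BiGraph's algorithms, so the following is only an illustrative sketch of why bipartite structure can matter for vertex replication. It compares two edge placements on a small synthetic user-item graph: placing each edge on a random machine, and placing every edge on the machine chosen by its user endpoint. The graph sizes, the machine count, and all function names (`make_bipartite_edges`, `replication_factor`, `favor_users`) are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import defaultdict

NUM_MACHINES = 4  # assumed cluster size, chosen only for illustration


def make_bipartite_edges(num_users=200, num_items=20, degree=5, seed=1):
    """Build a synthetic bipartite graph: each user links to `degree` distinct items."""
    rng = random.Random(seed)
    edges = []
    for user in range(num_users):
        for item in rng.sample(range(num_items), degree):
            edges.append((user, item))
    return edges


def replication_factor(edges, place):
    """Average number of machines that hold a copy of each vertex."""
    copies = defaultdict(set)
    for user, item in edges:
        machine = place(user, item)
        # Both endpoints of an edge need a copy on the edge's machine.
        copies[("user", user)].add(machine)
        copies[("item", item)].add(machine)
    return sum(len(machines) for machines in copies.values()) / len(copies)


def make_random_placement(seed=2):
    """Baseline: every edge goes to a uniformly random machine."""
    rng = random.Random(seed)
    return lambda user, item: rng.randrange(NUM_MACHINES)


def favor_users(user, item):
    """Hypothetical bipartite-aware rule: all edges of a user share one machine."""
    return user % NUM_MACHINES


edges = make_bipartite_edges()
rf_random = replication_factor(edges, make_random_placement())
rf_favor_users = replication_factor(edges, favor_users)
print(f"random placement:      {rf_random:.2f}")
print(f"user-keyed placement:  {rf_favor_users:.2f}")
```

In this toy setting the user-keyed rule keeps exactly one copy of every user, so only the small item subset is replicated. Whether a real workload behaves the same way depends on its degree distribution and load balance, which this sketch does not model.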
Pages: 20-29
Page count: 10