Efficient MapReduce algorithms for triangle listing in billion-scale graphs

Cited by: 7
Authors
Zhu, Yuanyuan [1]
Zhang, Hao [1]
Qin, Lu [2]
Cheng, Hong [3]
Affiliations
[1] Wuhan Univ, Sch Comp, State Key Lab Software Engn, Wuhan, Peoples R China
[2] Univ Technol Sydney, Ctr Quantum Computat & Intelligent Syst, Sydney, NSW, Australia
[3] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
Funding
Australian Research Council; National Science Foundation (USA);
Keywords
Triangle listing; MapReduce; Massive graph; Filtering;
DOI
10.1007/s10619-017-7193-1
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper addresses the classical triangle listing problem, which aims to enumerate all triples of vertices that are pairwise connected by edges. This problem has been studied intensively in internal and external memory, but it remains an urgent challenge in distributed environments, where multiple machines across a network can be used to achieve good performance and scalability. As one of the de facto computing methodologies in distributed environments, MapReduce has been used in some existing triangle listing algorithms. However, these algorithms usually need to shuffle a huge amount of intermediate data, which seriously hinders their scalability on large-scale graphs. In this paper, we propose a new triangle listing algorithm in MapReduce, FTL, which uses a lightweight data structure to substantially reduce the intermediate data transferred during the shuffle stage and is equipped with multiple-round techniques to ease the burden on memory and network bandwidth when dealing with billion-scale graphs. We prove that the size of the intermediate data is bounded close to the number of triangles in the graph. To further reduce the shuffle size and memory cost, we also propose improved algorithms based on a compact data structure and present several optimization techniques to accelerate the computation and reduce memory consumption. Extensive experimental results show that our algorithms outperform existing competitors by several times on both synthetic and real-world graphs.
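To make the shuffle-size concern in the abstract concrete, the following is a minimal single-process sketch of a generic two-round, wedge-based MapReduce triangle listing baseline. It is not the FTL algorithm, whose data structures are not described in this record. The function name `list_triangles_two_round`, the degree-based vertex ordering, and the in-memory dictionaries standing in for the shuffle stage are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def list_triangles_two_round(edges):
    """Simulate a wedge-based two-round MapReduce triangle listing.

    Hypothetical baseline for illustration only (not FTL).
    Returns (sorted list of triangles, number of intermediate wedge records).
    """
    # Normalize: drop self-loops and duplicate or reversed edges.
    undirected = {(min(u, v), max(u, v)) for u, v in edges if u != v}

    degree = defaultdict(int)
    for u, v in undirected:
        degree[u] += 1
        degree[v] += 1

    def rank(x):
        # Total order on vertices: by degree, ties broken by vertex id.
        return (degree[x], x)

    # Round 1 map: orient each edge from the lower-ranked endpoint to the
    # higher-ranked one and key it by the lower-ranked endpoint.
    oriented = set()
    shuffle1 = defaultdict(list)
    for u, v in undirected:
        lo, hi = sorted((u, v), key=rank)
        oriented.add((lo, hi))
        shuffle1[lo].append(hi)

    # Round 1 reduce: emit one wedge record per pair of out-neighbors.
    # These records are the intermediate data that must be shuffled.
    wedges = []
    for center, nbrs in shuffle1.items():
        for v, w in combinations(sorted(nbrs, key=rank), 2):
            wedges.append(((v, w), center))

    # Round 2: group wedges by their open endpoints and keep those that
    # are closed by an actual edge of the graph.
    shuffle2 = defaultdict(list)
    for key, center in wedges:
        shuffle2[key].append(center)

    triangles = []
    for (v, w), centers in shuffle2.items():
        if (v, w) in oriented:
            for center in centers:
                triangles.append(tuple(sorted((center, v, w))))
    return sorted(triangles), len(wedges)


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(list_triangles_two_round(k4))
```

In this sketch every wedge is shuffled whether or not it closes into a triangle, so a triangle-free graph such as a 4-cycle still produces intermediate records. That gap between wedge count and triangle count is the kind of overhead the abstract says FTL is designed to reduce.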
Pages: 149-176
Number of pages: 28
Source
Distributed and Parallel Databases, 2017, 35: 149-176