Efficient Triangle Listing for Billion-Scale Graphs

Authors
Zhang, Hao [1]
Zhu, Yuanyuan [1]
Qin, Lu [2]
Cheng, Hong [3]
Yu, Jeffrey Xu [3]
Affiliations
[1] Wuhan Univ, State Key Lab Software Engn, Wuhan, Peoples R China
[2] Univ Technol Sydney, Ctr Quantum Computat & Intelligent Syst, Sydney, NSW, Australia
[3] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
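Funding
US National Science Foundation; Australian Research Council
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract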
This paper addresses the classical triangle listing problem, which aims to enumerate all tuples of three vertices that are pairwise connected by edges. The problem has been studied intensively in internal and external memory, but it remains an urgent challenge in distributed environments, where multiple machines across a network can be used to achieve good performance and scalability. As one of the de facto computing methodologies in distributed environments, MapReduce has been used in some existing triangle listing algorithms. However, these algorithms usually need to shuffle a huge amount of intermediate data, which seriously hinders scalability on large-scale graphs. In this paper, we propose FTL, a new triangle listing algorithm in MapReduce. FTL uses a lightweight data structure to substantially reduce the intermediate data transferred during the shuffle stage, and it is equipped with multiple-round techniques that ease the burden on memory and network bandwidth when dealing with billion-scale graphs. We prove that the size of the intermediate data is bounded close to the number of triangles in the graph. To further reduce the shuffle size in each round, we also devise a compact data structure for storing the intermediate data, which can save up to 2/3 of the space. Extensive experimental results show that our algorithms outperform existing competitors by several times on large real-world graphs.
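To make the problem statement concrete, the following is a minimal single-machine sketch of triangle listing in Python. It is not the FTL algorithm from the paper, whose details the abstract does not give; it uses the common degree-ordering technique, and the function name `list_triangles` and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def list_triangles(edges):
    """Return each triangle exactly once as a sorted tuple of vertex ids.

    Illustrative in-memory baseline (hypothetical helper), not the
    paper's MapReduce algorithm.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)

    # Total order on vertices: by degree, ties broken by vertex id.
    ordered = sorted(neighbors, key=lambda v: (len(neighbors[v]), v))
    rank = {v: i for i, v in enumerate(ordered)}

    # Orient every edge from its lower-ranked endpoint to its higher-ranked one.
    out = {v: {w for w in neighbors[v] if rank[w] > rank[v]} for v in neighbors}

    triangles = []
    for u in out:
        # u is the lowest-ranked vertex of any triangle found here,
        # so every triangle is reported exactly once.
        for v, w in combinations(out[u], 2):
            if w in neighbors[v]:
                triangles.append(tuple(sorted((u, v, w))))
    return sorted(triangles)


if __name__ == "__main__":
    sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 5)]
    print(list_triangles(sample_edges))
```

Orienting edges by degree limits the number of neighbor pairs checked around high-degree vertices. In a distributed setting, candidate pairs like these are the kind of intermediate data that must be shuffled between machines, which is the cost the abstract says FTL aims to reduce.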
Pages: 813-822
Page count: 10