An Efficient MapReduce Algorithm for Counting Triangles in a Very Large Graph

Cited by: 30
Authors
Park, Ha-Myung [1]
Chung, Chin-Wan [2]
Affiliations
[1] Korea Adv Inst Sci & Technol, Div Web Sci & Technol, 291 Daehak Ro, Daejeon, South Korea
[2] Korea Adv Inst Sci & Technol, Div Web Sci & Technol, Dept Comp Sci, Daejeon, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Graph; triangle; MapReduce;
DOI
10.1145/2505515.2505563
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The triangle counting problem is one of the fundamental problems in various domains. It can be used to compute the clustering coefficient, transitivity, triangular connectivity, trusses, etc. The problem has been extensively studied for internal memory, but those algorithms do not scale to enormous graphs. In recent years, MapReduce has emerged as a de facto standard framework for processing large data through parallel computing. A MapReduce algorithm based on graph partitioning was proposed for the problem. However, that algorithm redundantly generates a large amount of intermediate data, which causes network overload and prolongs the processing time. In this paper, we propose a new algorithm based on graph partitioning with a novel idea of triangle classification to count the number of triangles in a graph. The algorithm substantially reduces the duplication by classifying triangles into three types and processing each triangle differently according to its type. In the experiments, we compare the proposed algorithm with recent existing algorithms using both synthetic and real-world datasets composed of millions of nodes and billions of edges. The proposed algorithm outperforms the other algorithms in most cases. In particular, for a Twitter dataset, the proposed algorithm is more than twice as fast as existing MapReduce algorithms. Moreover, the performance gap increases as the graph becomes larger and denser.
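As a rough illustration of the idea of classifying triangles under a graph partitioning, the following single-machine Python sketch counts triangles and groups them by how many distinct partitions their three vertices fall into. This is an assumption made for illustration: the abstract does not define the three types, and the function name `count_triangles_by_type` and the modulo partition function are hypothetical. It is not the MapReduce algorithm of the paper.

```python
def count_triangles_by_type(edges, num_partitions):
    """Count triangles, grouped by the number of distinct partitions
    (1, 2, or 3) that their vertices belong to.

    Assumptions (not from the abstract): vertices are non-negative
    integers, and a vertex v is assigned to partition v % num_partitions.
    """
    # Build adjacency sets for an undirected simple graph.
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def part(v):
        # Hypothetical partition function.
        return v % num_partitions

    counts = {1: 0, 2: 0, 3: 0}
    for u in adj:
        for v in adj[u]:
            if v <= u:
                continue
            # Common neighbours of u and v close a triangle; requiring
            # u < v < w counts each triangle exactly once.
            for w in adj[u] & adj[v]:
                if w > v:
                    counts[len({part(u), part(v), part(w)})] += 1
    return counts


# Complete graph on 4 vertices: it contains 4 triangles in total.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
by_type = count_triangles_by_type(k4, num_partitions=3)
total = sum(by_type.values())
```

The total is independent of the partitioning; only the split across the three groups changes with `num_partitions`.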
Pages: 539-548
Number of pages: 10
Related papers
50 in total
  • [1] Graph partitioning MapReduce-based algorithms for counting triangles in large-scale graphs
    Ahmed Sharafeldeen
    Mohammed Alrahmawy
    Samir Elmougy
    [J]. Scientific Reports, 13
  • [2] Graph partitioning MapReduce-based algorithms for counting triangles in large-scale graphs
    Sharafeldeen, Ahmed
    Alrahmawy, Mohammed
    Elmougy, Samir
    [J]. SCIENTIFIC REPORTS, 2023, 13 (01)
  • [3] COUNTING TRIANGLES IN MASSIVE GRAPHS WITH MAPREDUCE
    Kolda, Tamara G.
    Pinar, Ali
    Plantenga, Todd
    Seshadhri, C.
    Task, Christine
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2014, 36 (05): S48-S77
  • [4] Efficient Estimation of Triangles in Very Large Graphs
    Etemadi, Roohollah
    Lu, Jianguo
    Tsin, Yung H.
    [J]. CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2016: 1251-1260
  • [5] Comparing MapReduce and Pipeline Implementations for Counting Triangles
    Pasarella, Edelmira
    Vidal, Maria-Esther
    Zoltan, Cristina
    [J]. ELECTRONIC PROCEEDINGS IN THEORETICAL COMPUTER SCIENCE, 2017, (237): 20-33
  • [6] A second look at counting triangles in graph streams
    Cormode, Graham
    Jowhari, Hossein
    [J]. THEORETICAL COMPUTER SCIENCE, 2014, 552: 44-51
  • [7] Counting and Sampling Triangles from a Graph Stream
    Pavan, A.
    Tangwongsan, Kanat
    Tirthapura, Srikanta
    Wu, Kun-Lung
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (14): 1870-1881
  • [8] MASCOT: Memory-efficient and Accurate Sampling for Counting Local Triangles in Graph Streams
    Lim, Yongsub
    Kang, U.
    [J]. KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015: 685-694
  • [9] Counting Triangles in Large Graphs on GPU
    Polak, Adam
    [J]. 2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016: 740-746
  • [10] A Space-efficient Parallel Algorithm for Counting Exact Triangles in Massive Networks
    Arifuzzaman, Shaikh
    Khan, Maleq
    Marathe, Madhav
    [J]. 2015 IEEE 17TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2015 IEEE 7TH INTERNATIONAL SYMPOSIUM ON CYBERSPACE SAFETY AND SECURITY, AND 2015 IEEE 12TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (ICESS), 2015: 527-534