Efficient Triangle Counting in Large Graphs via Degree-Based Vertex Partitioning

Cited by: 58
Authors
Kolountzakis, Mihail N. [1]
Miller, Gary L. [2]
Peng, Richard [2]
Tsourakakis, Charalampos E. [3]
Affiliations
[1] Univ Crete, Dept Math, Knossou Ave, Iraklion 71409, Greece
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
[3] Carnegie Mellon Univ, Dept Math Sci, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA)
DOI
10.1080/15427951.2012.625260
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The number of triangles is a computationally expensive graph statistic frequently used in complex network analysis (e.g., transitivity ratio), in various random graph models (e.g., exponential random graph model), and in important real-world applications such as spam detection, uncovering the hidden thematic structures in the Web, and link recommendation. Counting triangles in graphs with millions or billions of edges requires algorithms that run fast, use little space, provide accurate estimates of the number of triangles, and preferably are parallelizable. In this paper we present an efficient triangle-counting approximation algorithm that can be adapted to the semistreaming model [Feigenbaum et al. 05]. Its key idea is to combine the sampling algorithm of [Tsourakakis et al. 09, Tsourakakis et al. 11] with the partitioning of the set of vertices into high- and low-degree subsets as in [Alon et al. 97], treating each set appropriately. From a mathematical perspective, we present a simplified proof, based on the Hajnal-Szemerédi theorem [Hajnal and Szemeredi 70], of the result of [Tsourakakis et al. 11], whose original proof uses the powerful Kim-Vu concentration inequality [Kim and Vu 00]. Furthermore, we improve bounds of existing triple-sampling techniques based on a theorem of [Ahlswede and Katona 78]. We obtain a running time of $O\left(m + \frac{m^{3/2} \Delta \log n}{t \epsilon^2}\right)$ and a $(1 \pm \epsilon)$ approximation, where $n$ is the number of vertices, $m$ is the number of edges, $t$ is the number of triangles, and $\Delta$ is the maximum number of triangles in which any single edge is contained. Furthermore, we show how this algorithm can be adapted to the semistreaming model with space usage $O\left(m^{1/2} \log n + \frac{m^{3/2} \Delta \log n}{t \epsilon^2}\right)$ and a constant number of passes (three) over the graph stream. We apply our methods to various networks with several millions of edges and we obtain excellent results, outperforming existing triangle-counting methods. Finally, we propose a random-projection-based method for triangle counting and provide a sufficient condition to obtain an estimate with low variance.
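The sampling idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the edge-sampling scheme of [Tsourakakis et al. 09], in which each edge is kept independently with probability p and the triangle count of the sparsified graph is scaled by 1/p^3. The exact counter used here is a standard degree-ordered neighbor intersection, swapped in for simplicity; it does not reproduce the paper's high- and low-degree partitioning. The function names `count_triangles` and `estimate_triangles` are hypothetical.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Exact triangle count via degree-ordered neighbor intersection.

    Assumption: a stand-in exact counter, not the paper's hybrid scheme.
    Vertex ids must be mutually comparable (e.g., all ints).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Rank vertices by (degree, id) and keep only edges pointing "upward".
    rank = {v: (len(adj[v]), v) for v in adj}
    out = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}
    # Each triangle is found exactly once, at its two lowest-ranked vertices.
    return sum(len(out[u] & out[v]) for u in out for v in out[u])


def estimate_triangles(edges, p, seed=None):
    """Keep each edge with probability p, count exactly, rescale by 1/p^3.

    A triangle survives sampling only if all three of its edges are kept,
    which happens with probability p^3, so the rescaled count is unbiased.
    """
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    return count_triangles(kept) / p ** 3


if __name__ == "__main__":
    # Complete graph on 30 vertices: C(30, 3) = 4060 triangles.
    k30 = list(combinations(range(30), 2))
    print("exact:", count_triangles(k30))
    print("estimate (p=0.5):", estimate_triangles(k30, 0.5, seed=1))
```

Smaller values of p reduce the work on the sparsified graph but increase the variance of the estimate. The abstract's concentration results concern how small p can be while still giving a $(1 \pm \epsilon)$ approximation.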
Pages: 161-185
Number of pages: 25