Better Algorithms for Counting Triangles in Data Streams

Cited by: 36
Authors
McGregor, Andrew [1]
Vorotnikova, Sofya [1]
Vu, Hoa T. [1]
Affiliations
[1] Univ Massachusetts, Amherst, MA 01003 USA
Keywords
data streams; triangles; clustering coefficients; GRAPH; SUBGRAPH
DOI
10.1145/2902251.2902283
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We present space-efficient data stream algorithms for approximating the number of triangles in a graph up to a factor $1 + \epsilon$. While it can be shown that determining whether a graph is triangle-free is not possible in sub-linear space, a large body of work has focused on minimizing the space required in terms of the number of triangles $T$ (or a lower bound on this quantity) and other parameters including the number of nodes $n$ and the number of edges $m$. Two models are important in the literature: the arbitrary order model, in which the stream consists of the edges of the graph in arbitrary order, and the adjacency list order model, in which all edges incident to the same node appear consecutively. We improve over the state-of-the-art results in both models. For the adjacency list order model, we show that $\tilde{O}(\epsilon^{-2} m/\sqrt{T})$ space is sufficient in one pass and $\tilde{O}(\epsilon^{-2} m^{3/2}/T)$ space is sufficient in two passes, where the $\tilde{O}(\cdot)$ notation suppresses log factors. For the arbitrary order model, we show that $\tilde{O}(\epsilon^{-2} m/\sqrt{T})$ space suffices given two passes and that $\tilde{O}(\epsilon^{-2} m^{3/2}/T)$ space suffices given three passes and oracle access to the degrees. Finally, we show how to efficiently implement the "wedge sampling" approach to triangle estimation in the arbitrary order model. To do this, we develop the first algorithm for $\ell_p$ sampling such that multiple independent samples can be generated with $O(\mathrm{polylog}\, n)$ update time; this primitive is widely applicable and this result may be of independent interest.
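As a rough illustration of the "wedge sampling" idea mentioned in the abstract, the following Python sketch estimates the triangle count of a graph held entirely in memory. It is not the streaming algorithm of the paper: it assumes random access to adjacency sets, and the function names (`build_adjacency`, `exact_triangles`, `wedge_sampling_estimate`) are hypothetical names chosen for this example. A wedge is a path of length two; each triangle contains exactly three closed wedges, so the estimate is the total wedge count times the sampled fraction of closed wedges, divided by three.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build undirected adjacency sets, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def exact_triangles(adj):
    """Count triangles exactly by checking every wedge (for comparison only)."""
    closed_wedges = 0
    for center in adj:
        for a, b in combinations(sorted(adj[center]), 2):
            if b in adj[a]:
                closed_wedges += 1
    # Each triangle is seen once per corner, i.e. three times in total.
    return closed_wedges // 3


def wedge_sampling_estimate(adj, num_samples, rng):
    """Estimate the triangle count from uniformly sampled wedges.

    Assumption: the whole graph is in memory, unlike the streaming setting.
    """
    centers = [v for v in adj if len(adj[v]) >= 2]
    # A node of degree d is the center of d * (d - 1) / 2 wedges.
    weights = [len(adj[v]) * (len(adj[v]) - 1) // 2 for v in centers]
    total_wedges = sum(weights)
    if total_wedges == 0:
        return 0.0
    closed = 0
    # Picking a center proportionally to its wedge count and then two distinct
    # neighbors uniformly yields a uniformly random wedge.
    for center in rng.choices(centers, weights=weights, k=num_samples):
        a, b = rng.sample(sorted(adj[center]), 2)
        if b in adj[a]:
            closed += 1
    return total_wedges * (closed / num_samples) / 3
```

For example, `wedge_sampling_estimate(build_adjacency(edges), 1000, random.Random(0))` returns a floating-point estimate that can be compared with `exact_triangles` on small inputs. The accuracy depends on the fraction of wedges that are closed, so graphs with few triangles relative to wedges need more samples.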
Pages: 401-411
Number of pages: 11