CoCoS: Fast and Accurate Distributed Triangle Counting in Graph Streams

Cited by: 6
Authors
Shin, Kijung [1]
Lee, Euiwoong [2]
Oh, Jinoh [3]
Hammoud, Mohammad [4]
Faloutsos, Christos [3]
Affiliations
[1] Korea Adv Inst Sci & Technol, 291 Daehak Ro, Daejeon 34141, South Korea
[2] Univ Michigan, 500 S State St, Ann Arbor, MI 48109 USA
[3] Carnegie Mellon Univ, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
[4] Carnegie Mellon Univ Qatar, Doha 24866, Qatar
Funding
National Research Foundation, Singapore; U.S. National Science Foundation
Keywords
Graph stream; triangle counting; sampling; streaming algorithms; distributed algorithms
DOI
10.1145/3441487
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Given a graph stream, how can we estimate the number of triangles in it using multiple machines with limited storage? Specifically, how should edges be processed and sampled across the machines for rapid and accurate estimation? The count of triangles (i.e., cliques of size three) has proven useful in numerous applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent work has focused largely on streaming algorithms and distributed algorithms but little on their combinations for "the best of both worlds." In this work, we propose CoCoS, a fast and accurate distributed streaming algorithm for estimating the counts of global triangles (i.e., all triangles) and local triangles incident to each node. Making one pass over the input stream, CoCoS carefully processes and stores the edges across multiple machines so that the redundant use of computational and storage resources is minimized. Compared to baselines, CoCoS is: (a) accurate: giving up to 39x smaller estimation error; (b) fast: up to 10.4x faster, scaling linearly with the size of the input stream; and (c) theoretically sound: yielding unbiased estimates.
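To make the problem setting in the abstract concrete, the following is a minimal single-machine sketch of one-pass triangle estimation with independent edge sampling. It is not CoCoS and does not reproduce its distributed edge placement. The function name `estimate_triangles`, the sampling probability `p`, and the `seed` parameter are illustrative assumptions, and the stream is assumed to contain no self-loops or duplicate edges.

```python
import random
from collections import defaultdict


def estimate_triangles(stream, p=1.0, seed=0):
    """One-pass estimate of global and local triangle counts.

    Hypothetical illustration, not the CoCoS algorithm. Each arriving
    edge (u, v) first counts the triangles it closes with the edges
    stored so far, then is itself stored with probability p. A triangle
    is detected only if both of its earlier edges were stored, which
    happens with probability p**2, so each detection is weighted by
    1 / p**2 to keep the estimate unbiased.
    """
    rng = random.Random(seed)
    neighbors = defaultdict(set)      # adjacency sets of the sampled graph
    global_count = 0.0                # estimate of all triangles
    local_count = defaultdict(float)  # estimate of triangles per node
    weight = 1.0 / (p * p)

    for u, v in stream:
        # Nodes adjacent to both endpoints close a triangle with (u, v).
        for w in neighbors[u] & neighbors[v]:
            global_count += weight
            local_count[u] += weight
            local_count[v] += weight
            local_count[w] += weight
        # Store the new edge with probability p (always when p == 1.0).
        if rng.random() < p:
            neighbors[u].add(v)
            neighbors[v].add(u)

    return global_count, dict(local_count)


if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
    print(estimate_triangles(edges, p=1.0))
```

With `p=1.0` every edge is stored and the counts are exact. Lowering `p` trades accuracy for memory, which is the trade-off that a distributed streaming approach such as the one described in the abstract aims to improve by spreading the edges across machines.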
Pages: 30