Global triangle estimation based on first edge sampling in large graph streams

Cited by: 1
Authors
Yu, Changyong [1]
Liu, Huimin [1]
Wahab, Fazal [1]
Ling, Zihan [1]
Ren, Tianmei [1]
Ma, Haitao [1]
Zhao, Yuhai [1]
Affiliations
[1] Northeastern Univ, Coll Comp Sci & Engn, Shenyang 110819, Liaoning, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2023 / Vol. 79 / Issue 13
Funding
National Natural Science Foundation of China;
Keywords
Graph stream; Triangle counting; First-edge sampling; Probability and statistics; COUNTS;
DOI
10.1007/s11227-023-05205-3
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Approximate triangle counting has emerged as a prominent problem in graph stream research in the past few years, with applications ranging from social network analysis to web topic mining and motif detection in informatics. Many graph stream sampling and approximate triangle counting algorithms have been proposed, and the majority of them guarantee unbiased estimation. However, they either cannot bound the memory overhead, or they produce results with excessive uncertainty because they use an excessively large sampling space. In this article, we propose RFES, a set of one-pass stream algorithms for counting the global number of triangles in a fully dynamic graph stream in an unbiased, low-variance, and high-precision manner. RFES has three algorithms: RFES-BASE, RFES-IMPR, and RFES-FD, which represent the basic, improved, and fully dynamic versions, respectively. Each algorithm is based on our proposed first-edge reservoir sampling method, which shrinks the sampling space while increasing the uncertainty of triangles in the sample. It can deal with fully dynamic data with a lower theoretical estimation variance than state-of-the-art algorithms. Extensive experimental results demonstrate that RFES is more accurate and takes less time. The source code of RFES can be downloaded from https://github.com/BioLab310/RFES.
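For readers unfamiliar with the setting, the following is a minimal sketch of a generic reservoir-based global triangle estimator for an insert-only edge stream. It is not the first-edge sampling method of RFES, which this record does not describe in detail; it only illustrates the general idea of keeping a bounded edge sample and rescaling the triangle count found in it to obtain an unbiased estimate. The class name `ReservoirTriangleEstimator`, its methods, and the `capacity` parameter are hypothetical names chosen for this illustration, and the sketch assumes an undirected stream with no duplicate edges and no self-loops.

```python
import random


class ReservoirTriangleEstimator:
    """Generic reservoir edge sampling sketch (an assumption, NOT the RFES method)."""

    def __init__(self, capacity, seed=None):
        if capacity < 3:
            raise ValueError("capacity must be at least 3 to hold a triangle")
        self.capacity = capacity      # maximum number of edges kept in memory
        self.rng = random.Random(seed)
        self.edges = []               # the reservoir of sampled edges
        self.adj = {}                 # adjacency sets of the sampled subgraph
        self.seen = 0                 # number of stream edges processed so far
        self.sample_triangles = 0     # triangles fully contained in the sample

    def _common(self, u, v):
        # Number of sampled vertices adjacent to both u and v.
        return len(self.adj.get(u, set()) & self.adj.get(v, set()))

    def _insert(self, u, v):
        # Each common neighbour closes one new triangle inside the sample.
        self.sample_triangles += self._common(u, v)
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)
        self.edges.append((u, v))

    def _evict_random(self):
        # Remove a uniformly random sampled edge and the triangles it closed.
        idx = self.rng.randrange(len(self.edges))
        u, v = self.edges[idx]
        self.edges[idx] = self.edges[-1]
        self.edges.pop()
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        self.sample_triangles -= self._common(u, v)

    def process(self, u, v):
        # Standard reservoir sampling: keep the t-th edge with probability capacity / t.
        self.seen += 1
        if self.seen <= self.capacity:
            self._insert(u, v)
        elif self.rng.random() < self.capacity / self.seen:
            self._evict_random()
            self._insert(u, v)

    def estimate(self):
        # Rescale by the inverse probability that all three edges of a
        # triangle are in the reservoir after t stream edges.
        t, m = self.seen, self.capacity
        if t <= m:
            return float(self.sample_triangles)
        scale = (t * (t - 1) * (t - 2)) / (m * (m - 1) * (m - 2))
        return self.sample_triangles * scale


# Usage: the complete graph on 4 vertices contains 4 triangles.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
estimator = ReservoirTriangleEstimator(capacity=10, seed=0)
for u, v in k4_edges:
    estimator.process(u, v)
print(estimator.estimate())
```

When the reservoir can hold the whole stream, as in the usage above, the estimate is the exact count. With a smaller `capacity`, each run returns a random value whose expectation is the true count, and the variance of that value is the quantity that sampling schemes such as the one proposed in the abstract aim to reduce.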
Pages: 14079-14116
Page count: 38
Related papers
50 in total
  • [2] Reservoir-based sampling over large graph streams to estimate triangle counts and node degrees
    Zhang, Lingling
    Jiang, Hong
    Wang, Fang
    Feng, Dan
    Xie, Yanwen
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 108 : 244 - 255
  • [3] Improved Triangle Counting in Graph Streams: Power of Multi-Sampling
    Kavassery-Parakkat, Neeraj
    Hanjani, Kiana Mousavi
    Pavan, A.
    [J]. 2018 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2018, : 33 - 40
  • [4] PES: Priority Edge Sampling in Streaming Triangle Estimation
    Etemadi, Roohollah
    Lu, Jianguo
    [J]. IEEE TRANSACTIONS ON BIG DATA, 2022, 8 (02) : 470 - 481
  • [5] Edge Hashing Distributed Sampling Algorithm for Triangle Counting in Large-scale Dynamic Graph Stream
    He, Yulin
    Wu, Bo
    Wu, Dingming
    Huang, Zhexue
    Philippe, Fournier-Viger
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2024, 61 (08): : 1882 - 1903
  • [6] WRS: Waiting Room Sampling for Accurate Triangle Counting in Real Graph Streams
    Shin, Kijung
    [J]. 2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2017, : 1087 - 1092
  • [7] A distributed streaming framework for edge-cloud triangle counting in graph streams
    Yang, Xu
    Song, Chao
    Gu, Jiqing
    Li, Ke
    Li, Hongwei
    [J]. KNOWLEDGE-BASED SYSTEMS, 2023, 278
  • [8] Edge-Based Wedge Sampling to Estimate Triangle Counts in Very Large Graphs
    Turkoglu, Duru
    Turk, Ata
    [J]. 2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2017, : 455 - 464
  • [9] Temporal locality-aware sampling for accurate triangle counting in real graph streams
    Lee, Dongjin
    Shin, Kijung
    Faloutsos, Christos
    [J]. VLDB JOURNAL, 2020, 29 (06): : 1501 - 1525