Efficient Estimation of Triangles in Very Large Graphs

Cited by: 9
Authors
Etemadi, Roohollah [1]
Lu, Jianguo [1]
Tsin, Yung H. [1]
Affiliation
[1] Univ Windsor, Sch Comp Sci, Windsor, ON N9B 3P4, Canada
Keywords
Graph Sampling; Estimation; Triangles; Graph Algorithms; Clustering Coefficient;
DOI
10.1145/2983323.2983849
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The number of triangles in a graph is an important metric for understanding the graph. It is also directly related to the clustering coefficient of a graph, which is one of the most important indicators for social networks. Counting the number of triangles is computationally expensive for very large graphs. Hence, estimation is necessary for large graphs, particularly for graphs that are hidden behind searchable interfaces, where the graphs are not available in their entirety. For instance, the user networks of Twitter and Facebook are not available for third parties to explore directly. This paper proposes a new method to estimate the number of triangles based on random edge sampling. It improves on traditional random edge sampling by probing the edges that have a higher probability of forming triangles. The method consistently outperforms the traditional method, and can be better by orders of magnitude when the graph is very large. The result is demonstrated on 20 graphs, including the largest graphs we can find. More importantly, we proved the improvement ratio and verified our result on all the datasets. The analytical results are achieved by simplifying the variances of the estimators based on the assumption that the graph is very large. We believe that such a big-data assumption can lead to interesting results not only in triangle estimation, but also in other sampling problems.
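For orientation, the following is a minimal sketch of one common form of a random-edge-sampling triangle estimator. It is not the method proposed in the paper, and it may differ from the baseline the authors compare against. It assumes an undirected simple graph given as a list of unique edges, and it assumes the neighbors of any node can be queried. The function names (`build_adjacency`, `exact_triangles`, `estimate_triangles`) are hypothetical. The idea: each triangle is seen from exactly three edges, so the sum over all edges of the number of common neighbors of the two endpoints equals three times the triangle count, and a uniform sample of edges gives an unbiased estimate of that sum.

```python
import random


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected simple graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def exact_triangles(edges):
    """Exact count: every triangle is counted once from each of its 3 edges."""
    adj = build_adjacency(edges)
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def estimate_triangles(edges, sample_size, seed=None):
    """Estimate the triangle count from uniformly sampled edges.

    Edges are drawn with replacement. For each sampled edge (u, v) the
    common neighbors of u and v are counted; the sample mean is scaled by
    the number of edges and divided by 3.
    """
    rng = random.Random(seed)
    adj = build_adjacency(edges)
    sampled = [rng.choice(edges) for _ in range(sample_size)]
    total = sum(len(adj[u] & adj[v]) for u, v in sampled)
    return len(edges) * total / (3 * sample_size)


# Usage: the complete graph on 5 nodes has 10 edges and 10 triangles.
k5 = [(u, v) for u in range(5) for v in range(u + 1, 5)]
print(exact_triangles(k5), estimate_triangles(k5, sample_size=4, seed=0))
```

In the complete graph every edge has the same number of common neighbors, so the estimate has zero variance there. On real graphs the variance depends on how unevenly triangles are spread over edges, which is the quantity the abstract says the proposed method improves.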
Pages: 1251-1260
Number of pages: 10