Generating Large-Scale Heterogeneous Graphs for Benchmarking

Cited by: 0
Author
Gupta, Amarnath [1]
Affiliation
[1] Univ Calif San Diego, San Diego Supercomp Ctr, La Jolla, CA 92093 USA
Keywords
heterogeneous graph; benchmarking; power law; community structure; data generation
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Graphs have emerged as an important genre of data found in a wide class of applications. The dominant benchmark for graph data today is Graph 500, which generates Stochastic Kronecker graphs of various sizes and reports the time to perform a breadth-first search. Apache Giraph uses PageRank computation as an algorithmic benchmark for large graphs but does not provide a mechanism to generate graph data. Other graph benchmarks have been developed by smaller communities and are not widely known. However, most benchmarking data for graphs are derived from a single structure-generation model and therefore do not capture the variability of structure and content. To this end, we propose heterogeneous graphs, a mixed-model graph structure that combines several existing generation techniques into a single benchmark. It is a hybrid that constructs edge-labeled multigraphs with multiple components, which can be hierarchical graphs, power-law graphs, community-forming graphs, and a new class of graphs formed by motif composition. The user can specify the graph with a simple set of 4 parameters, but has the option to use several more parameters for finer control of the hybrid structure. We define the generation process for heterogeneous graphs and propose an initial set of query operations against the generated data.
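For readers unfamiliar with the Graph 500 style of benchmark mentioned in the abstract, the following is a minimal sketch of the idea: sample edges from a stochastic Kronecker (R-MAT style) model, then run a breadth-first search over the result. It is not the Graph 500 reference implementation and not the generator proposed in this paper. The function names `kronecker_edges` and `bfs_levels` are hypothetical, and the default quadrant probabilities (0.57, 0.19, 0.19, with the remainder 0.05) and edge factor of 16 are assumed defaults taken from the commonly cited Graph 500 settings.

```python
import random
from collections import deque


def kronecker_edges(scale, edge_factor=16, a=0.57, b=0.19, c=0.19, seed=0):
    """Sample edges of an R-MAT style stochastic Kronecker graph.

    The graph has 2**scale vertices and edge_factor * 2**scale edges.
    Self-loops and duplicate edges are possible and are kept as sampled.
    """
    rng = random.Random(seed)
    n_edges = edge_factor * (1 << scale)
    edges = []
    for _ in range(n_edges):
        u = v = 0
        # Choose one quadrant of the adjacency matrix per bit of the vertex id.
        for _bit in range(scale):
            r = rng.random()
            u <<= 1
            v <<= 1
            if r < a:
                pass                # top-left quadrant: both bits stay 0
            elif r < a + b:
                v |= 1              # top-right quadrant
            elif r < a + b + c:
                u |= 1              # bottom-left quadrant
            else:
                u |= 1              # bottom-right quadrant
                v |= 1
        edges.append((u, v))
    return edges


def bfs_levels(edges, n, root):
    """Return the BFS level of every vertex (-1 if unreachable from root)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        if u != v:                  # ignore self-loops; treat edges as undirected
            adj[u].append(v)
            adj[v].append(u)
    level = [-1] * n
    level[root] = 0
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if level[y] == -1:
                level[y] = level[x] + 1
                queue.append(y)
    return level
```

A benchmark in this style would time `bfs_levels` over several roots. Because every edge comes from one generation model, the structural variety is limited, which is the gap the abstract's mixed-model proposal addresses.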
Pages: 113-128
Page count: 16