Generating Large-Scale Heterogeneous Graphs for Benchmarking

被引:0
|
作者
Gupta, Amarnath [1 ]
机构
[1] Univ Calif San Diego, San Diego Supercomp Ctr, La Jolla, CA 92093 USA
来源
关键词
heterogeneous graph; benchmarking; power law; community structure; data generation;
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Graphs have emerged as an important genre of data that are found in a wide class of applications. The most dominant benchmark for graph data today is Graph 500 that generates a Stochastic Kronecker graph of various sizes, and reports the time to perform a breadth-first search. Apache Giraph uses Pagerank computation as an algorithmic benchmark for large graphs, but does not provide the mechanism to generate graph data. Other forms of graph benchmarks have been developed by smaller communities and are not known widely. However, most benchmarking data for graphs are derived from a single structure generation model, and therefore does not capture the variability of structure and content. To this end, we propose heterogeneous graphs, a mixed model graph structure that combines several existing generation techniques into a single benchmark. It is a hybrid that constructs edge-labeled multigraphs with multiple components, which can be hierarchical, power-law graphs, community-forming graphs, and a new class of graphs formed by motif composition. The user can use a simple set of 4 parameters to specify the graph, but has the option to use several more parameters to have a finer control of the hybrid structure. We define the generation process for heterogeneous graphs and propose an initial set of query operations against the generated data.
引用
收藏
页码:113 / 128
页数:16
相关论文
共 50 条
  • [21] Group Centrality Maximization for Large-scale Graphs
    Angriman, Eugenio
    van der Grinten, Alexander
    Bojchevski, Aleksandar
    Zuegner, Daniel
    Guennemann, Stephan
    Meyerhenke, Henning
    [J]. 2020 PROCEEDINGS OF THE SYMPOSIUM ON ALGORITHM ENGINEERING AND EXPERIMENTS, ALENEX, 2020, : 56 - 69
  • [22] Structure-based Large-scale Dynamic Heterogeneous Graphs Processing: Applications, Challenges and Solutions
    Zhang, Wenjie
    [J]. COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2022, WWW 2022 COMPANION, 2022, : 1006 - 1006
  • [23] Understanding Coarsening for Embedding Large-Scale Graphs
    Akyildiz, Taha Atahan
    Aljundi, Amro Alabsi
    Kaya, Kamer
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 2937 - 2946
  • [24] Efficient Machine Learning On Large-Scale Graphs
    Erickson, Parker
    Lee, Victor E.
    Shi, Feng
    Tang, Jiliang
    [J]. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 4788 - 4789
  • [25] Readable representations for large-scale bipartite graphs
    Sato, Shuji
    Misue, Kazuo
    Tanaka, Jiro
    [J]. KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 2, PROCEEDINGS, 2008, 5178 : 831 - 838
  • [26] Large-scale Machine Learning over Graphs
    Yang, Yiming
    [J]. PROCEEDINGS OF THE 2018 ACM SIGIR INTERNATIONAL CONFERENCE ON THEORY OF INFORMATION RETRIEVAL (ICTIR'18), 2018, : 9 - 9
  • [27] Large-scale quantum networks based on graphs
    Epping, Michael
    Kampermann, Hermann
    Bruss, Dagmar
    [J]. NEW JOURNAL OF PHYSICS, 2016, 18
  • [28] Adaptive Partitioning of Large-Scale Dynamic Graphs
    Vaquero, Luis M.
    Cuadrado, Felix
    Logothetis, Dionysios
    Martella, Claudio
    [J]. 2014 IEEE 34TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2014), 2014, : 144 - 153
  • [29] Parallel generation of large-scale random graphs
    Vullikanti, Anil
    [J]. 2018 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2018), 2018, : 278 - 278
  • [30] Multilevel Parallelism for the Exploration of Large-Scale Graphs
    Bernaschi, Massimo
    Bisson, Mauro
    Mastrostefano, Enrico
    Vella, Flavio
    [J]. IEEE TRANSACTIONS ON MULTI-SCALE COMPUTING SYSTEMS, 2018, 4 (03): : 204 - 216