Generating Large-Scale Heterogeneous Graphs for Benchmarking

Author
Gupta, Amarnath [1]
Affiliation
[1] Univ Calif San Diego, San Diego Supercomp Ctr, La Jolla, CA 92093 USA
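Keywords
heterogeneous graph; benchmarking; power law; community structure; data generation
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812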
Abstract
Graphs have emerged as an important genre of data found in a wide class of applications. The dominant benchmark for graph data today is Graph 500, which generates stochastic Kronecker graphs of various sizes and reports the time to perform a breadth-first search. Apache Giraph uses PageRank computation as an algorithmic benchmark for large graphs, but does not provide a mechanism to generate graph data. Other graph benchmarks have been developed by smaller communities and are not widely known. However, most benchmarking data for graphs are derived from a single structure-generation model and therefore do not capture the variability of structure and content. To this end, we propose heterogeneous graphs, a mixed-model graph structure that combines several existing generation techniques into a single benchmark. It is a hybrid that constructs edge-labeled multigraphs with multiple components, which can be hierarchical graphs, power-law graphs, community-forming graphs, and a new class of graphs formed by motif composition. The user can specify the graph with a simple set of 4 parameters, but has the option to use several more parameters for finer control of the hybrid structure. We define the generation process for heterogeneous graphs and propose an initial set of query operations against the generated data.
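As a rough illustration of the mixed-model idea in the abstract, the sketch below composes three independently generated components (a hierarchy, a heavy-tailed component, and a community component) into one edge-labeled graph stored as (source, target, label) triples. It is not the paper's generator: the function names, the `sizes`, `labels`, and `seed` parameters, and the use of preferential attachment and a planted partition as stand-ins are all assumptions made for illustration, and motif composition is omitted. The triple list permits parallel edges, although this sketch does not create them deliberately.

```python
import random


def tree_component(n, rng):
    """Hierarchical stand-in: each node i > 0 attaches to a random earlier node."""
    return [(rng.randrange(i), i) for i in range(1, n)]


def preferential_component(n, rng):
    """Heavy-tailed stand-in: each new node attaches to an existing node
    chosen with probability proportional to its current degree."""
    edges = [(0, 1)]
    endpoints = [0, 1]  # each node appears once per incident edge
    for v in range(2, n):
        u = rng.choice(endpoints)
        edges.append((u, v))
        endpoints.extend((u, v))
    return edges


def community_component(n, rng, blocks=2, p_in=0.5, p_out=0.02):
    """Community stand-in: planted partition, dense inside blocks, sparse across."""
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if u % blocks == v % blocks else p_out
            if rng.random() < p:
                edges.append((u, v))
    return edges


def generate(sizes, labels, seed=0):
    """Combine the components into one list of (source, target, label) triples.

    All parameters here are hypothetical; they are not the 4 parameters
    mentioned in the abstract, which the abstract does not enumerate.
    """
    rng = random.Random(seed)
    builders = [tree_component, preferential_component, community_component]
    graph, offset = [], 0
    for build, n in zip(builders, sizes):
        for u, v in build(n, rng):
            # Shift node ids so components occupy disjoint id ranges.
            graph.append((u + offset, v + offset, rng.choice(labels)))
        offset += n
    return graph


graph = generate(sizes=(50, 50, 40), labels=("cites", "follows", "likes"))
```

Keeping each component behind its own builder function means a different structural model could be swapped in without touching the composition step, which is the property a mixed-model benchmark would need.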
Pages: 113-128
Page count: 16