Generating Large-Scale Heterogeneous Graphs for Benchmarking

Cited by: 0

Authors:

Gupta, Amarnath [1]

Affiliations:

[1] Univ Calif San Diego, San Diego Supercomp Ctr, La Jolla, CA 92093 USA

Source:

SPECIFYING BIG DATA BENCHMARKS | 2014 / Vol. 8163

Keywords:

heterogeneous graph; benchmarking; power law; community structure; data generation

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Graphs have emerged as an important genre of data found in a wide class of applications. The most dominant benchmark for graph data today is Graph 500, which generates Stochastic Kronecker graphs of various sizes and reports the time to perform a breadth-first search. Apache Giraph uses PageRank computation as an algorithmic benchmark for large graphs, but does not provide a mechanism to generate graph data. Other forms of graph benchmarks have been developed by smaller communities and are not widely known. However, most benchmarking data for graphs are derived from a single structure generation model, and therefore do not capture the variability of structure and content. To this end, we propose heterogeneous graphs, a mixed-model graph structure that combines several existing generation techniques into a single benchmark. It is a hybrid that constructs edge-labeled multigraphs with multiple components, which can be hierarchical, power-law graphs, community-forming graphs, and a new class of graphs formed by motif composition. The user can specify the graph with a simple set of 4 parameters, but has the option to use several more parameters for finer control of the hybrid structure. We define the generation process for heterogeneous graphs and propose an initial set of query operations against the generated data.


Pages: 113-128

Page count: 16
