Datamime: Generating Representative Benchmarks by Automatically Synthesizing Datasets

Cited by: 1
Authors
Lee, Hyun Ryong [1]
Sanchez, Daniel [1]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
Keywords
benchmarking; workload generation; OPTIMIZATION;
DOI
10.1109/MICRO56248.2022.00082
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812 ;
Abstract
Benchmarks that closely match the behavior of production workloads are crucial to design and provision computer systems. However, current approaches fall short: First, open-source benchmarks use public datasets that cause different behavior from production workloads. Second, black-box workload cloning techniques generate synthetic code that imitates the target workload, but the resulting program fails to capture most workload characteristics, such as microarchitectural bottlenecks or time-varying behavior. Generating code that mimics a complex application is an extremely hard problem. Instead, we propose a different and easier approach to benchmark synthesis. Our key insight is that, for many production workloads, the program is publicly available or there is a reasonably similar open-source program. In this case, generating the right dataset is sufficient to produce an accurate benchmark. Based on this observation, we present Datamime, a profile-guided approach to generate representative benchmarks for production workloads. Datamime uses the performance profiles of a target workload to generate a dataset that, when used by a benchmark program, behaves very similarly to the target workload in terms of its microarchitectural characteristics. We evaluate Datamime on several datacenter workloads. Datamime generates synthetic benchmarks that closely match the microarchitectural features of these workloads, with a mean absolute percentage error of 3.2% on IPC. Microarchitectural behavior stays close across processor types. Finally, time-varying behaviors are also replicated, making these benchmarks useful to e.g. characterize and optimize tail latency.
Pages: 1144-1159
Page count: 16
Related papers
50 in total
  • [1] Genesys: Automatically Generating Representative Training Sets for Predictive Benchmarking
    Panda, Reena
    Zheng, Xinnian
    Song, Shuang
    Ryoo, Jee Ho
    LeBeane, Michael
    Gerstlauer, Andreas
    John, Lizy K.
    [J]. PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING AND SIMULATION (SAMOS), 2016, : 116 - 123
  • [2] Synthesizing Benchmarks for Predictive Modeling
    Cummins, Chris
    Petoumenos, Pavlos
    Wang, Zheng
    Leather, Hugh
    [J]. CGO'17: PROCEEDINGS OF THE 2017 INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, 2017, : 86 - 99
  • [3] Synthesizing Complementary Circuits Automatically
    Shen, ShengYu
    Qin, Ying
    Wang, KeFei
    Xiao, LiQuan
    Zhang, JianMin
    Li, SiKun
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2010, 29 (08) : 1191 - 1202
  • [4] Synthesizing Datasets for Pervasive Spaces
    Mendez-Vazquez, Andres
    Helal, Abdelsalam
    Cook, Diane Joyce
    [J]. INTELLIGENT ENVIRONMENTS 2009, 2009, 2 : 303 - +
  • [5] A Case Study on Machine Learning for Synthesizing Benchmarks
    Goens, Andres
    Brauckmann, Alexander
    Ertel, Sebastian
    Cummins, Chris
    Leather, Hugh
    Castrillon, Jeronimo
    [J]. PROCEEDINGS OF THE 3RD ACM SIGPLAN INTERNATIONAL WORKSHOP ON MACHINE LEARNING AND PROGRAMMING LANGUAGES (MAPL '19), 2019, : 38 - 46
  • [6] Small representative benchmarks for thermochemical calculations
    Lynch, BJ
    Truhlar, DG
    [J]. JOURNAL OF PHYSICAL CHEMISTRY A, 2003, 107 (42) : 8996 - 8999
  • [7] AUTOMATICALLY GENERATING ABSTRACTIONS FOR PLANNING
    KNOBLOCK, CA
    [J]. ARTIFICIAL INTELLIGENCE, 1994, 68 (02) : 243 - 302
  • [8] Automatically annotating and integrating spatial datasets
    Chen, CC
    Thakkar, S
    Knoblock, C
    Shahabi, C
    [J]. ADVANCES IN SPATIAL AND TEMPORAL DATABASES, PROCEEDINGS, 2003, 2750 : 469 - 488
  • [9] Automatically generating Construction Diary
    Unknown
    [J]. BAUINGENIEUR, 2018, 93 : A35 - A35
  • [10] Automatically Generating Models of IT Systems
    Kovacevic, Ivan
    Gros, Stjepan
    Derek, Ante
    [J]. IEEE ACCESS, 2022, 10 : 13536 - 13554