From BigBench to TPCx-BB: Standardization of a Big Data Benchmark

Authors
Cao, Paul [1]
Gowda, Bhaskar [2]
Lakshmi, Seetha [3]
Narasimhadevara, Chinmayi [4]
Nguyen, Patrick [5]
Poelman, John [6]
Poess, Meikel [7]
Rabl, Tilmann [8, 9]
Affiliations
[1] Hewlett Packard Enterprise, Palo Alto, CA USA
[2] Intel Corp, Hillsboro, OR 97124 USA
[3] Actian Corp, Palo Alto, CA USA
[4] Cisco Syst Inc, San Jose, CA USA
[5] Microsoft Corp, Redmond, WA 98052 USA
[6] IBM Corp, San Jose, CA USA
[7] Oracle Corp, Redwood City, CA USA
[8] Tech Univ Berlin, Berlin, Germany
[9] DFKI GmbH, Berlin, Germany
Source
PERFORMANCE EVALUATION AND BENCHMARKING: TRADITIONAL - BIG DATA - INTERNET OF THINGS, TPCTC 2016 | 2017, Vol. 10080
Keywords
SCALE;
DOI
10.1007/978-3-319-54334-5_3
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
With the increased adoption of Hadoop-based big data systems for analyzing data of large volume and variety, an effective and common benchmark for big data deployments is needed. There have been a number of proposals from industry and academia to address this challenge. While most either have basic workloads (e.g. word counting) or port existing benchmarks to big data systems (e.g. TPC-H or TPC-DS), some are specifically designed for big data challenges. The most comprehensive proposal among these is the BigBench benchmark, recently standardized by the Transaction Processing Performance Council as TPCx-BB. In this paper, we discuss the progress made from the original BigBench proposal to the standardized TPCx-BB. In addition, we share the thought process that went into creating the specification, describe the challenges of navigating the uncharted territory of a complex benchmark for a fast-moving technology domain, and analyze the functionality of the benchmark suite on different Hadoop- and non-Hadoop-based big data engines. We provide insights on the first official result of TPCx-BB and, finally, briefly discuss other relevant and fast-growing big data analytic use cases to be addressed in future big data benchmarks.
Pages: 24-44
Page count: 21