Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data

Cited by: 12
Authors
Baru, Chaitanya [11]
Bhandarkar, Milind [10]
Curino, Carlo [7]
Danisch, Manuel [1]
Frank, Michael [1]
Gowda, Bhaskar [6]
Jacobsen, Hans-Arno [8]
Jie, Huang [6]
Kumar, Dileep [3]
Nambiar, Raghunath [2]
Poess, Meikel [9]
Raab, Francois [5]
Rabl, Tilmann [1]
Ravi, Nishkam [3]
Sachs, Kai [12]
Sen, Saptak [4]
Yi, Lan [6]
Youn, Choonhan [11]
Affiliations
[1] Bankmark, Passau, Germany
[2] Cisco Syst, San Jose, CA USA
[3] Cloudera, Palo Alto, CA USA
[4] Hortonworks, Santa Clara, CA USA
[5] Infosizing, Manitou Springs, CO USA
[6] Intel Corp, Santa Clara, CA USA
[7] Microsoft Corp, Redmond, WA 98052 USA
[8] Middleware Syst Res Grp, Toronto, ON, Canada
[9] Oracle Corp, Redwood City, CA USA
[10] Pivotal, Vancouver, BC, Canada
[11] San Diego Supercomp Ctr, La Jolla, CA USA
[12] SPEC Res Grp, Gainesville, FL USA
DOI
10.1007/978-3-319-15350-6_4
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Enterprises perceive a huge opportunity in mining information that can be found in big data. New storage systems and processing paradigms are allowing for ever larger data sets to be collected and analyzed. The high demand for data analytics and the rapid development of technologies have led to a sizable ecosystem of big data processing systems. However, the lack of established, standardized benchmarks makes it difficult for users to choose the systems that suit their requirements. To address this problem, we have developed the BigBench benchmark specification. BigBench is the first end-to-end big data analytics benchmark suite. In this paper, we present the BigBench benchmark and analyze the workload from a technical as well as a business point of view. We characterize the queries in the workload along different dimensions, according to their functional characteristics, and also analyze their runtime behavior. Finally, we evaluate the suitability and relevance of the workload from the point of view of enterprise applications, and discuss potential extensions to the proposed specification in order to cover typical big data processing use cases.
Pages: 44-63
Page count: 20