Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data

Cited by: 12
Authors
Baru, Chaitanya [11]
Bhandarkar, Milind [10]
Curino, Carlo [7]
Danisch, Manuel [1]
Frank, Michael [1]
Gowda, Bhaskar [6]
Jacobsen, Hans-Arno [8]
Jie, Huang [6]
Kumar, Dileep [3]
Nambiar, Raghunath [2]
Poess, Meikel [9]
Raab, Francois [5]
Rabl, Tilmann [1]
Ravi, Nishkam [3]
Sachs, Kai [12]
Sen, Saptak [4]
Yi, Lan [6]
Youn, Choonhan [11]
Affiliations
[1] Bankmark, Passau, Germany
[2] Cisco Syst, San Jose, CA USA
[3] Cloudera, Palo Alto, CA USA
[4] Hortonworks, Santa Clara, CA USA
[5] Infosizing, Manitou Springs, CO USA
[6] Intel Corp, Santa Clara, CA USA
[7] Microsoft Corp, Redmond, WA 98052 USA
[8] Middleware Syst Res Grp, Toronto, ON, Canada
[9] Oracle Corp, Redwood City, CA USA
[10] Pivotal, Vancouver, BC, Canada
[11] San Diego Supercomp Ctr, La Jolla, CA USA
[12] SPEC Res Grp, Gainesville, FL USA
Source
PERFORMANCE CHARACTERIZATION AND BENCHMARKING: TRADITIONAL TO BIG DATA | 2015, Vol. 8904
DOI
10.1007/978-3-319-15350-6_4
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Enterprises perceive a huge opportunity in mining the information found in big data. New storage systems and processing paradigms allow ever larger data sets to be collected and analyzed. The high demand for data analytics and the rapid development of technologies have led to a sizable ecosystem of big data processing systems. However, the lack of established, standardized benchmarks makes it difficult for users to choose the systems that suit their requirements. To address this problem, we have developed the BigBench benchmark specification. BigBench is the first end-to-end big data analytics benchmark suite. In this paper, we present the BigBench benchmark and analyze the workload from a technical as well as a business point of view. We characterize the queries in the workload along different dimensions, according to their functional characteristics, and also analyze their runtime behavior. Finally, we evaluate the suitability and relevance of the workload from the point of view of enterprise applications, and discuss potential extensions to the proposed specification that would cover typical big data processing use cases.
Pages: 44-63
Page count: 20