Cloud Analytics Benchmark

Cited by: 1
Authors
Van Renen, Alexander [1]
Leis, Viktor [2]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg, Erlangen, Germany
[2] Tech Univ Munich, Munich, Germany
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2023 / Vol. 16 / No. 6
Funding
European Research Council
Keywords
COST
DOI
10.14778/3583140.3583156
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The cloud facilitates the transition to a service-oriented perspective. This affects cloud-native data management in general, and data analytics in particular. Instead of managing a multi-node database cluster on-premises, end users simply send queries to a managed cloud data warehouse and receive results. While this is obviously very attractive for end users, database system architects still have to engineer systems for this new service model. There are currently many competing architectures, ranging from self-hosted (Presto, PostgreSQL) through managed (Snowflake, Amazon Redshift) to query-as-a-service (Amazon Athena, Google BigQuery) offerings. Benchmarking these architectural approaches is currently difficult, and it is not even clear what the metrics for a comparison should be. To overcome these challenges, we first analyze a real-world query trace from Snowflake and compare its properties to those of TPC-H and TPC-DS. In doing so, we identify important differences that distinguish traditional benchmarks from real-world cloud data warehouse workloads. Based on this analysis, we propose the Cloud Analytics Benchmark (CAB). By incorporating workload fluctuations and multi-tenancy, CAB allows evaluating different designs in terms of user-centered metrics such as cost and performance.
Pages: 1413-1425
Number of pages: 13