Big SQL systems: an experimental evaluation

Cited by: 2
Authors
Aluko, Victor [1 ]
Sakr, Sherif [1 ]
Affiliations
[1] Univ Tartu, Tartu, Estonia
Keywords
Big data; Big SQL; Benchmarking;
DOI
10.1007/s10586-019-02914-4
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Recently, Big Data systems have been gaining popularity for handling the massive amounts of data that are continuously generated in our digital world. While the Hadoop framework pioneered the area of Big Data processing systems, it has clear performance limitations when processing massive amounts of structured data. In addition, in practice, many users of Big Data systems struggle with the APIs and low-level programming abstractions of these systems; they would prefer to use SQL (in which they are more proficient) as a high-level declarative language to express their tasks while leaving all of the execution optimization details to the backend engine. Thus, several systems have been designed and implemented to tackle these challenges by providing scalable query execution engines for processing massive structured data while supporting SQL interfaces. In this article, we present an extensive experimental study of four popular systems in this domain, namely, Apache Hive, Spark SQL, Apache Impala and PrestoDB. In particular, we report and analyze the performance characteristics of these systems using three different benchmarks, namely, TPC-H, TPC-DS and TPCx-BB. Finally, we report a set of insights and important lessons that we have learned from conducting our experiments.
Pages: 1347-1377
Page count: 31
Related papers
50 in total
  • [21] Adaptive Caching in Big SQL using the HDFS Cache
    Floratou, Avrilia
    Megiddo, Nimrod
    Potti, Navneet
    Ozcan, Fatma
    Kale, Uday
    Schmitz-Hermes, Jan
    PROCEEDINGS OF THE SEVENTH ACM SYMPOSIUM ON CLOUD COMPUTING (SOCC 2016), 2016, : 321 - 333
  • [22] EVALUATION OF ACTINOMYCINS IN EXPERIMENTAL SYSTEMS
    GOLDIN, A
    JOHNSON, RK
    CANCER CHEMOTHERAPY REPORTS PART 1, 1974, 58 (01): : 63 - 77
  • [23] The SQL File Evaluation (SQLFE) Tool: A Flexible and Extendible System for Evaluation of SQL Queries
    Wagner, Paul J.
    SIGCSE 2020: PROCEEDINGS OF THE 51ST ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, 2020, : 1334 - 1334
  • [24] Evaluation of Data Management Systems for Geospatial Big Data
    Amirian, Pouria
    Basiri, Anahid
    Winstanley, Adam
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2014, PT V, 2014, 8583 : 678 - +
  • [25] An Experimental Comparison of Complex Object Implementations for Big Data Systems
    Sikdar, Sourav
    Teymourian, Kia
    Jermaine, Chris
    PROCEEDINGS OF THE 2017 SYMPOSIUM ON CLOUD COMPUTING (SOCC '17), 2017, : 432 - 444
  • [26] Assessing Big Data SQL Frameworks for Analyzing Event Logs
    Hinkka, Markku
    Lehto, Teemu
    Heljanko, Keijo
    2016 24TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP), 2016, : 101 - 108
  • [27] Big Data Analytics: Exploring Graphs with Optimized SQL Queries
    Al-Amin, Sikder Tahsin
    Ordonez, Carlos
    Bellatreche, Ladjel
    DATABASE AND EXPERT SYSTEMS APPLICATIONS: DEXA 2018 INTERNATIONAL WORKSHOPS, 2018, 903 : 88 - 100
  • [28] Experimental Evaluation of Virtual Pottery Systems
    Dashti, Sarah
    Navarro-Newball, Aa
    Prakash, Edmond
    Hussain, Fiaz
    Carroll, Fiona
    2021 INTERNATIONAL CONFERENCE ON CYBERWORLDS (CW 2021), 2021, : 25 - 32
  • [29] EXPERIMENTAL EVALUATION OF GEOCHEMISTRY OF GEOTHERMAL SYSTEMS
    DOWNS, WF
    BARNES, HL
    JOURNAL OF THE ELECTROCHEMICAL SOCIETY, 1976, 123 (08) : C248 - C248
  • [30] EXPERIMENTAL EVALUATION OF DIVERSITY RECEIVING SYSTEMS
    GLASER, JL
    VANWAMBECK, SH
    PROCEEDINGS OF THE INSTITUTE OF RADIO ENGINEERS, 1950, 38 (02): : 209 - 210