DHive: Query Execution Performance Analysis via Dataflow in Apache Hive

Authors
Zhang, Chaozu [1]
Shen, Qiaomu [2]
Tang, Bo [1]
Affiliations
[1] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen, Peoples R China
[2] Southern Univ Sci & Technol, Res Inst Trustworthy Autonomous Syst, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2023 / Vol. 16 / No. 12
DOI
10.14778/3611540.3611605
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Apache Hive is widely used for large-scale data analysis in many organizations. Various visual analytics tools have been developed to help Hive users quickly analyze the query execution process and identify the performance bottlenecks of executed queries. However, existing tools mostly focus on showing the time usage of query sub-components (jobs and operators) and fail to provide enough evidence to analyze the root causes of slow execution. To tackle this problem, we develop DHive, a visual analytics system that visualizes and analyzes query execution progress via dataflow analysis. DHive shows the dataflow during query execution at multiple levels (query level, job level, and task level), which enables users to identify the key jobs and tasks and explain their time usage by linking them to auxiliary information such as the system configuration and hardware status. We demonstrate the effectiveness of DHive with two cases in a production cluster. DHive is open source at https://github.com/DBGroupSUSTech/DHive.git.
Pages: 3998 - 4001
Page count: 4