Optimizing Performance of Aggregate Query Processing with Histogram Data Structure

Cited by: 0
Authors
Liang Yong [1]
Mu Zhaonan [1]
Affiliations
[1] Guizhou University of Commerce, Network & Information Center, Guiyang 550014, Guizhou, People's Republic of China
Keywords
Massive data; Approximate query processing; Histogram; Aggregate query; Performance optimization;
DOI
10.1007/978-3-030-19807-7_33
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In today's big data era, the ability to analyze massive data efficiently and return results within a short time limit is critical to decision making. Many big data systems have therefore been proposed, and various distributed and parallel processing techniques have been heavily investigated. Most previous research focuses on precise query processing, while approximate query processing (AQP) techniques, which make interactive data exploration more efficient and allow users to trade off query accuracy against response time, have not been investigated comprehensively. In this paper, we study the characteristics of aggregate queries, a typical type of analytical query, and propose an approximate query processing approach that uses a histogram data structure to optimize the execution of aggregate queries over massive data. We implemented this approach in the big data system Hive and compared it with Hive and the AQP-enabled big data system BlinkDB. The experimental results verify that our approach is significantly faster than these existing systems in most scenarios.
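The abstract does not say how the histogram is built or queried, so the following is only a minimal sketch of the general idea of answering range aggregates from a histogram instead of scanning the data. It assumes an equi-width histogram that stores a count and a sum per bucket and assumes values are spread uniformly inside each bucket. The class `EquiWidthHistogram` and its methods are hypothetical names introduced for illustration; they are not the authors' implementation and not part of Hive or BlinkDB.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class EquiWidthHistogram:
    """Hypothetical summary structure: per-bucket count and sum of one numeric column."""
    lo: float            # lower bound of the first bucket
    width: float         # width shared by all buckets
    counts: List[int]    # number of values that fell into each bucket
    sums: List[float]    # sum of the values that fell into each bucket

    @classmethod
    def build(cls, values: Sequence[float], num_buckets: int) -> "EquiWidthHistogram":
        # One pass over the data; afterwards queries touch only the buckets.
        lo, hi = min(values), max(values)
        width = (hi - lo) / num_buckets or 1.0  # avoid zero width if all values are equal
        counts = [0] * num_buckets
        sums = [0.0] * num_buckets
        for v in values:
            i = min(int((v - lo) / width), num_buckets - 1)  # clamp the maximum value
            counts[i] += 1
            sums[i] += v
        return cls(lo, width, counts, sums)

    def _overlap(self, i: int, a: float, b: float) -> float:
        # Fraction of bucket i covered by the query range [a, b).
        b_lo = self.lo + i * self.width
        b_hi = b_lo + self.width
        return max(0.0, min(b, b_hi) - max(a, b_lo)) / self.width

    def approx_count(self, a: float, b: float) -> float:
        # Assumption: values are uniformly distributed inside each bucket.
        return sum(c * self._overlap(i, a, b) for i, c in enumerate(self.counts))

    def approx_sum(self, a: float, b: float) -> float:
        # Scales each bucket's sum by the covered fraction (a coarse estimate
        # for partially covered buckets).
        return sum(s * self._overlap(i, a, b) for i, s in enumerate(self.sums))

    def approx_avg(self, a: float, b: float) -> Optional[float]:
        n = self.approx_count(a, b)
        return self.approx_sum(a, b) / n if n > 0 else None


if __name__ == "__main__":
    data = [float(v) for v in range(100)]
    hist = EquiWidthHistogram.build(data, num_buckets=10)
    # Roughly: SELECT COUNT(*), AVG(x) FROM t WHERE x >= 20 AND x < 60
    print(hist.approx_count(20.0, 60.0), hist.approx_avg(20.0, 60.0))
```

In this sketch the cost of a query depends on the number of buckets rather than the number of rows, and more buckets generally mean a smaller error at the price of a larger summary, which is the accuracy versus response time trade-off the abstract refers to.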
Pages: 342-350
Number of pages: 9