A Survey on Data-driven Performance Tuning for Big Data Analytics Platforms

Cited by: 8
Authors
Costa, Rogerio Luis de C. [1 ,2 ]
Moreira, Jose [2 ,3 ]
Pintor, Paulo [3 ]
dos Santos, Veronica [4 ]
Lifschitz, Sergio [4 ]
Affiliations
[1] Polytech Leiria, Comp Sci & Commun Res Ctr CIIC, P-2411901 Leiria, Portugal
[2] Univ Aveiro, Inst Elect & Informat Engn IEETA, P-3810193 Aveiro, Portugal
[3] Univ Aveiro, Dept Eletron Telecommun & Informat DETI, P-3810193 Aveiro, Portugal
[4] Pontificia Univ Catolica Rio de Janeiro PUC Rio, Dept Informat, BR-22451900 Rio De Janeiro, RJ, Brazil
Keywords
Big data systems; Big data platforms; Performance tuning; Database systems; DATA SYSTEMS; ARCHITECTURE; FRAMEWORK; EFFICIENT; DESIGN; SPARK; CHALLENGES; MANAGEMENT; INTERNET; ENGINE;
DOI
10.1016/j.bdr.2021.100206
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many research works deal with big data platforms looking forward to data science and analytics. These are complex and usually distributed environments, composed of several systems and tools. As expected, there is a need for a closer look at performance issues. In this work, we review performance tuning strategies in the big data environment. We focus on data-driven tuning techniques, discussing the use of database-inspired approaches. Concerning big data and NoSQL stores, performance tuning issues are quite different from the so-called conventional systems. Many existing solutions are mostly ad-hoc activities that do not fit for multiple situations. But there are some categories of data-driven solutions that can be taken as guidelines and incorporated into general-purpose auto-tuning modules for big data systems. We examine typical performance tuning actions, discussing available solutions to support some of the tuning process's primary activities. We also discuss recent implementations of data-driven performance tuning solutions for big data platforms. We propose an initial classification based on the domain state-of-the-art and present selected tuning actions for large-scale data processing systems. Finally, we organized existing works towards self-tuning big data systems based on this classification and presented general and system-specific tuning recommendations. We found that most of the literature pieces evaluate the use of tuning actions at the physical design perspective, and there is a lack of self-tuning machine learning-based solutions for big data systems. (C) 2021 Elsevier Inc. All rights reserved.
Pages: 17