A Survey on Data-driven Performance Tuning for Big Data Analytics Platforms

Cited by: 8
Authors
Costa, Rogerio Luis de C. [1 ,2 ]
Moreira, Jose [2 ,3 ]
Pintor, Paulo [3 ]
dos Santos, Veronica [4 ]
Lifschitz, Sergio [4 ]
Affiliations
[1] Polytech Leiria, Comp Sci & Commun Res Ctr CIIC, P-2411901 Leiria, Portugal
[2] Univ Aveiro, Inst Elect & Informat Engn IEETA, P-3810193 Aveiro, Portugal
[3] Univ Aveiro, Dept Eletron Telecommun & Informat DETI, P-3810193 Aveiro, Portugal
[4] Pontificia Univ Catolica Rio de Janeiro PUC Rio, Dept Informat, BR-22451900 Rio De Janeiro, RJ, Brazil
Keywords
Big data systems; Big data platforms; Performance tuning; Database systems; DATA SYSTEMS; ARCHITECTURE; FRAMEWORK; EFFICIENT; DESIGN; SPARK; CHALLENGES; MANAGEMENT; INTERNET; ENGINE;
DOI
10.1016/j.bdr.2021.100206
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many research works deal with big data platforms looking forward to data science and analytics. These are complex and usually distributed environments, composed of several systems and tools. As expected, there is a need for a closer look at performance issues. In this work, we review performance tuning strategies in the big data environment. We focus on data-driven tuning techniques, discussing the use of database-inspired approaches. Concerning big data and NoSQL stores, performance tuning issues are quite different from the so-called conventional systems. Many existing solutions are mostly ad-hoc activities that do not fit multiple situations. But there are some categories of data-driven solutions that can be taken as guidelines and incorporated into general-purpose auto-tuning modules for big data systems. We examine typical performance tuning actions, discussing available solutions to support some of the tuning process's primary activities. We also discuss recent implementations of data-driven performance tuning solutions for big data platforms. We propose an initial classification based on the domain state-of-the-art and present selected tuning actions for large-scale data processing systems. Finally, we organized existing works towards self-tuning big data systems based on this classification and presented general and system-specific tuning recommendations. We found that most of the literature pieces evaluate the use of tuning actions at the physical design perspective, and there is a lack of self-tuning machine learning-based solutions for big data systems. © 2021 Elsevier Inc. All rights reserved.
Pages: 17