ATCS: Auto-Tuning Configurations of Big Data Frameworks Based on Generative Adversarial Nets

Cited by: 18
Authors
Li, Mingyu [1 ,2 ]
Liu, Zhiqiang [1 ]
Shi, Xuanhua [1 ]
Jin, Hai [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Serv Comp Technol & Syst Lab,Cluster & Grid Comp, Wuhan 430074, Peoples R China
[2] Liupanshui Normal Univ, Sch Math & Comp Sci, Liupanshui 553004, Peoples R China
Keywords
Big data; generative adversarial nets; spark; genetic algorithm; automatic tune parameters; OPTIMIZATION; ALGORITHM;
DOI
10.1109/ACCESS.2020.2979812
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Subject classification code
0812 ;
Abstract
Big data processing frameworks (e.g., Spark, Storm) have been extensively used for massive data processing in industry. To improve the performance and robustness of these frameworks, developers provide users with highly configurable parameters. Because the parameter space is high-dimensional and parameters interact in complicated ways, manual tuning is time-consuming and ineffective. Building performance-prediction models for big data frameworks is challenging for two reasons: (1) the significant time required to collect training data and (2) the poor accuracy of the prediction model when training data are limited. To meet this challenge, we propose an auto-tuning configuration parameters system (ATCS), a new auto-tuning approach based on Generative Adversarial Nets (GAN). ATCS can build a performance prediction model with less training data and without sacrificing model accuracy. Moreover, an optimized Genetic Algorithm (GA) is used in ATCS to explore the parameter space for optimum solutions. To evaluate the effectiveness of ATCS, we select five frequently used Spark workloads, each of which runs on five data sets of different sizes. The results demonstrate that ATCS improves the performance of these workloads compared to the default configurations, with an average performance increase of 3.5x and a maximum of 6.9x. Experimental results also demonstrate that, to obtain similar model accuracy, ATCS needs only 6% of the training data required by a Deep Neural Network (DNN), 13% of that required by a Support Vector Machine (SVM), and 18% of that required by a Decision Tree (DT). Moreover, compared to these machine learning models, the average performance increase of ATCS is 1.7x that of DNN, 1.6x that of SVM, and 1.7x that of DT on the five typical Spark programs.
Pages: 50485-50496
Page count: 12
Related papers
50 in total
  • [21] GAIN: Missing Data Imputation using Generative Adversarial Nets
    Yoon, Jinsung
    Jordon, James
    van der Schaar, Mihaela
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [22] Generative Adversarial Networks for Increasing the Veracity of Big Data
    Dering, Matthew L.
    Tucker, Conrad S.
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 2595 - 2602
  • [23] Deep Learning Based Lung Region Segmentation with Data Preprocessing by Generative Adversarial Nets
    Nitta, Jumpei
    Nakao, Megumi
    Imanishi, Keiho
    Matsuda, Tetsuya
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 1278 - 1281
  • [24] Content-Aware Traffic Data Completion in ITS Based on Generative Adversarial Nets
    Han, Lingyi
    Zheng, Kan
    Zhao, Long
    Wang, Xianbin
    Wen, Huimin
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2020, 69 (10) : 11950 - 11962
  • [25] Research on Denoising Technology of Generative Adversarial Networks (GAN) Based on Big Data
    Feng Xiancheng
    Zhang Xinyu
    Qiu Wu
    MIPPR 2019: PARALLEL PROCESSING OF IMAGES AND OPTIMIZATION TECHNIQUES; AND MEDICAL IMAGING, 2020, 11431
  • [26] A sampling-based approach for communication libraries auto-tuning
    Brunet, Elisabeth
    Trahay, Francois
    Denis, Alexandre
    Namyst, Raymond
    2011 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2011, : 299 - 307
  • [27] Towards Machine Learning-Based Auto-tuning of MapReduce
    Yigitbasi, Nezih
    Willke, Theodore L.
    Liao, Guangdeng
    Epema, Dick
    2013 IEEE 21ST INTERNATIONAL SYMPOSIUM ON MODELING, ANALYSIS & SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS (MASCOTS 2013), 2013, : 11 - +
  • [28] Design for Auto-tuning PID Controller Based on Genetic Algorithms
    Fan, Liu
    Joo, Er Meng
    ICIEA: 2009 4TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, VOLS 1-6, 2009, : 1915 - 1919
  • [29] Auto-tuning of PID parameters based on switch step response
    Yang, Z
    Wang, JL
    1997 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT PROCESSING SYSTEMS, VOLS 1 & 2, 1997, : 779 - 782
  • [30] Auto-tuning of feedforward friction compensation based on the gradient method
    Altpeter, F
    Grunenberg, M
    Myszkorowski, P
    Longchamp, R
    PROCEEDINGS OF THE 2000 AMERICAN CONTROL CONFERENCE, VOLS 1-6, 2000, : 2600 - 2604