d-Simplexed: Adaptive Delaunay Triangulation for Performance Modeling and Prediction on Big Data Analytics

Cited by: 14
Authors
Chen, Yuxing [1]
Goetsch, Peter [1]
Hoque, Mohammad A. [1]
Lu, Jiaheng [1]
Tarkoma, Sasu [1]
Affiliations
[1] Univ Helsinki, Dept Comp Sci, Helsinki 00560, Finland
Funding
Academy of Finland
Keywords
Performance modeling; big data analytics; adaptive sampling; Delaunay triangulation; MapReduce
DOI
10.1109/TBDATA.2019.2948338
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Big Data processing systems (e.g., Spark) have a number of resource configuration parameters, such as memory size, CPU allocation, and the number of running nodes. Regular users and even expert administrators struggle to understand the relationship between different parameter configurations and the overall performance of the system. In this paper, we address this challenge by proposing a performance prediction framework, called d-Simplexed, that builds performance models over varied configurable parameters on Spark. We take inspiration from the field of Computational Geometry to construct a d-dimensional mesh using Delaunay Triangulation over a selected set of features. From this mesh, we predict execution time for various feature configurations. To minimize the time and resources needed to build a bootstrap model with a large number of configuration values, we propose an adaptive sampling technique that lets us collect as few training points as required. Our evaluation on a cluster of computers using the WordCount, PageRank, Kmeans, and Join workloads from the HiBench benchmark suite shows that we can achieve an estimation error rate of less than 5 percent by sampling less than 1 percent of the data.
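As a rough illustration of predicting execution time from a mesh, the sketch below interpolates a runtime inside a single 2-dimensional simplex (a triangle) using barycentric weights. It is not the authors' implementation: the abstract does not say how a value is read off the mesh, so linear interpolation is an assumption here, and the feature pair (memory in GB, CPU cores), the function names, and all sample runtimes are hypothetical. Building the Delaunay mesh itself and the adaptive sampling step are not shown.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # hypothetical features: (memory in GB, CPU cores)


def barycentric_weights(p: Point, a: Point, b: Point, c: Point) -> Tuple[float, float, float]:
    """Return the barycentric weights of p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b


def predict_runtime(p: Point, vertices: Sequence[Point], runtimes: Sequence[float]) -> float:
    """Linearly interpolate a runtime at p from the three measured vertices.

    Assumption: p lies inside the triangle, i.e. this simplex is the one
    that a mesh lookup would have selected for p.
    """
    weights = barycentric_weights(p, vertices[0], vertices[1], vertices[2])
    if min(weights) < -1e-12:
        raise ValueError("query point lies outside this simplex")
    return sum(w * t for w, t in zip(weights, runtimes))


if __name__ == "__main__":
    # Hypothetical measured configurations and their runtimes in seconds.
    measured = [(2.0, 1.0), (6.0, 1.0), (2.0, 5.0)]
    seconds = [100.0, 60.0, 80.0]
    print(predict_runtime((3.0, 2.0), measured, seconds))
```

In d dimensions the same idea applies with d + 1 vertices per simplex; a library such as SciPy can build the triangulation and locate the enclosing simplex, which this sketch leaves out.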
Pages: 458-469
Page count: 12