Profiling-Based Big Data Workflow Optimization in a Cross-layer Coupled Design Framework

Cited by: 0
Authors
Ye, Qianwen [1]
Wu, Chase Q. [1]
Liu, Wuji [1]
Hou, Aiqin [2]
Shen, Wei [3]
Affiliations
[1] New Jersey Inst Technol, Dept Comp Sci, Newark, NJ 07102 USA
[2] Northwest Univ, Sch Informat Sci & Technol, Xian 710127, Shaanxi, Peoples R China
[3] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou 310018, Zhejiang, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Big data workflows; performance optimization; workflow profiling; stochastic approximation; coupled design; scientific workflows; delay minimization; Spark;
DOI
10.1007/978-3-030-60248-2_14
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Big data processing and analysis increasingly rely on workflow technologies for knowledge discovery and scientific innovation. The execution of big data workflows is now commonly supported on reliable and scalable data storage and computing platforms such as Hadoop. There are a variety of factors affecting workflow performance across multiple layers of big data systems, including the inherent properties (such as scale and topology) of the workflow, the parallel computing engine it runs on, the resource manager that orchestrates distributed resources, the file system that stores data, as well as the parameter setting of each layer. Optimizing workflow performance is challenging because the compound effects of the aforementioned layers are complex and opaque to end users. Generally, tuning their parameters requires an in-depth understanding of big data systems, and the default settings do not always yield optimal performance. We propose a profiling-based cross-layer coupled design framework to determine the best parameter setting for each layer in the entire technology stack to optimize workflow performance. To tackle the large parameter space, we reduce the number of experiments needed for profiling with two approaches: i) identify a subset of critical parameters with the most significant influence through feature selection; and ii) minimize the search process within the value range of each critical parameter using stochastic approximation. Experimental results show that the proposed optimization framework provides the most suitable parameter settings for a given workflow to achieve the best performance. This profiling-based method could be used by end users and service providers to configure and execute large-scale workflows in complex big data systems.
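The abstract does not say which stochastic approximation scheme the authors use. As a rough illustration of the idea only, the sketch below applies simultaneous perturbation stochastic approximation (SPSA), chosen here as an assumption, to two hypothetical critical parameters. The function `synthetic_runtime`, the parameter names `cores` and `memory_gb`, and the bounds are all invented for this example; in a real profiling setup the objective would be a measured workflow execution time.

```python
import random


def synthetic_runtime(params):
    """Hypothetical stand-in for a profiled workflow runtime in seconds.

    A real objective would launch the workflow with these settings and
    time it; this toy surface has its minimum at cores=6, memory_gb=12.
    """
    cores, memory_gb = params
    return (cores - 6.0) ** 2 + 0.5 * (memory_gb - 12.0) ** 2 + 30.0


def spsa_minimize(objective, start, bounds, iterations=300, a=0.5, c=0.5, seed=0):
    """Minimize `objective` with SPSA, using two evaluations per iteration.

    Each iteration perturbs all parameters at once by a random +/-1 vector,
    so the number of profiling runs does not grow with the parameter count.
    """
    rng = random.Random(seed)
    theta = list(start)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602  # decaying step size (commonly used exponent)
        c_k = c / k ** 0.101  # decaying perturbation size
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = objective(plus) - objective(minus)
        # Gradient estimate per coordinate, then a step clipped to the bounds.
        theta = [
            min(max(t - a_k * diff / (2.0 * c_k * d), lo), hi)
            for t, d, (lo, hi) in zip(theta, delta, bounds)
        ]
    return theta


if __name__ == "__main__":
    best = spsa_minimize(
        synthetic_runtime,
        start=[2.0, 4.0],
        bounds=[(1.0, 16.0), (1.0, 32.0)],
    )
    print("tuned parameters:", best)
    print("estimated runtime:", synthetic_runtime(best))
```

SPSA is used here because it needs only two objective evaluations per iteration regardless of how many parameters are tuned, which matches the stated goal of reducing the number of profiling experiments. Integer-valued settings would need rounding before each run.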
Pages: 197-217
Page count: 21