Dynamic memory-aware scheduling in Spark computing environment

Cited by: 21
Authors
Tang, Zhuo [1, 2]
Zeng, Ailing [1]
Zhang, Xuedong [1]
Yang, Li [3]
Li, Kenli [1]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Natl Univ Def Technol, Sci & Technol Parallel & Distributed Proc Lab, Changsha 410073, Hunan, Peoples R China
[3] Changsha Univ Sci & Technol, Coll Comp & Commun Engn, Changsha 410076, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Concurrency; Dynamic adjustment; Memory resource; Spark; Task scheduling; Speculative execution; MapReduce
DOI
10.1016/j.jpdc.2020.03.010
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Scheduling plays an important role in improving the performance of parallel big data processing. Spark is an in-memory parallel computing framework that uses a multi-threaded model for task scheduling. Most Spark task scheduling does not take memory into account; instead, the number of concurrent task threads is fixed by the user, which can limit performance. To overcome this limitation in the Spark core source code, this paper proposes a dynamic memory-aware task scheduler for Spark (DMATS), which not only treats memory and network I/O as computational resources but also dynamically adjusts concurrency when scheduling tasks. Specifically, we first analyze the RDD-based Spark execution engine to obtain the amount of data each task processes, and propose an algorithm that estimates the initial adaptive task concurrency from the known task input information and the executor memory. Then, a dynamic adjustment algorithm changes the concurrency at run time using feedback information, so that the limited memory resources are used as fully as possible. We implement DMATS in Spark 2.3.4 and evaluate its performance with two typical benchmarks, shuffle-light and shuffle-heavy. The results show that the algorithm not only reduces execution time by 43.64% but also significantly improves resource utilization. Experiments also show that the proposed method has advantages over similar work such as WASP. (C) 2020 Elsevier Inc. All rights reserved.
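The two steps described in the abstract (an initial concurrency estimate from task input size and executor memory, followed by feedback-driven adjustment) can be illustrated with a minimal Python sketch. It is not the DMATS algorithm from the paper, whose formulas are not given here. The names `ExecutorState`, `initial_concurrency`, and `adjust_concurrency`, the `expansion_factor` multiplier, and the `low`/`high` thresholds are all hypothetical, and the sketch models memory only, not network I/O.

```python
from dataclasses import dataclass


@dataclass
class ExecutorState:
    """Hypothetical view of one executor's resources."""
    memory_bytes: int  # memory assumed to be available to running tasks
    cores: int         # upper bound on concurrent task threads


def initial_concurrency(executor: ExecutorState, task_input_bytes: int,
                        expansion_factor: float = 2.0) -> int:
    """Estimate how many tasks fit in executor memory at the same time.

    expansion_factor is an assumed multiplier for how much the input
    grows once it is deserialized in memory.
    """
    per_task = max(1, int(task_input_bytes * expansion_factor))
    fit = executor.memory_bytes // per_task
    # Never exceed the core count, and always allow at least one task.
    return max(1, min(executor.cores, fit))


def adjust_concurrency(current: int, executor: ExecutorState,
                       used_fraction: float,
                       low: float = 0.6, high: float = 0.9) -> int:
    """One feedback step based on the observed memory usage fraction.

    Shrinks concurrency under memory pressure, grows it when memory
    is underused, and otherwise leaves it unchanged.
    """
    if used_fraction > high:
        return max(1, current - 1)
    if used_fraction < low:
        return min(executor.cores, current + 1)
    return current


if __name__ == "__main__":
    executor = ExecutorState(memory_bytes=4 * 1024**3, cores=8)
    start = initial_concurrency(executor, task_input_bytes=512 * 1024**2)
    print("initial concurrency:", start)
    print("after memory pressure:", adjust_concurrency(start, executor, 0.95))
```

A step size of one task per feedback round keeps the sketch simple; a real scheduler would likely derive the step from measured per-task memory use.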
Pages: 10-22
Number of pages: 13
Related papers
50 in total
  • [1] Dynamic Memory-Aware Task-Tree Scheduling
    Aupy, Guillaume
    Brasseur, Clement
    Marchal, Loris
    [J]. 2017 31ST IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2017, : 758 - 767
  • [2] A Memory-Aware Spark Cache Replacement Strategy
    Zhang, Jingyu
    Zhang, Ruihan
    Alfarraj, Osama
    Tolba, Amr
    Kim, Gwang-Jun
    [J]. JOURNAL OF INTERNET TECHNOLOGY, 2022, 23 (06): : 1185 - 1190
  • [3] Memory-aware feedback scheduling of control tasks
    Robertz, Sven Gestegard
    Henriksson, Dan
    Cervin, Anton
    [J]. 2006 IEEE CONFERENCE ON EMERGING TECHNOLOGIES & FACTORY AUTOMATION, VOLS 1 -3, 2006, : 577 - +
  • [4] Memory-aware list scheduling for hybrid platforms
    Herrmann, Julien
    Marchal, Loris
    Robert, Yves
    [J]. PROCEEDINGS OF 2014 IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2014, : 690 - 699
  • [5] A new memory monitoring scheme for memory-aware scheduling and partitioning
    Suh, GE
    Devadas, S
    Rudolph, L
    [J]. EIGHTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, 2002, : 117 - 128
  • [6] Persistent Memory-Aware Scheduling for Serverless Workloads
    Samanta, Amit
    Ahmed, Faraz
    Cao, Lianjie
    Stutsman, Ryan
    Sharma, Puneet
    [J]. 2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW, 2023, : 615 - 621
  • [7] Exploration of memory-aware dynamic voltage scheduling for soft real-time applications
    Kim, YJ
    Kim, J
    [J]. 11TH IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND REAL-TIME COMPUTING SYSTEMS AND APPLICATIONS, PROCEEDINGS, 2005, : 177 - 180
  • [8] A constructive algorithm for memory-aware task assignment and scheduling
    Szymanek, R
    Kuchcinski, K
    [J]. PROCEEDINGS OF THE NINTH INTERNATIONAL SYMPOSIUM ON HARDWARE/SOFTWARE CODESIGN, 2001, : 147 - 152
  • [9] Memory-Aware Scheduling for Mixed-Criticality Systems
    Li, Zheng
    Wang, Li
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2016, PT II, 2016, 9787 : 140 - 156
  • [10] Memory-Aware Scheduling of Tasks Sharing Data on Multiple GPUs with Dynamic Runtime Systems
    Gonthier, Maxime
    Marchal, Loris
    Thibault, Samuel
    [J]. 2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2022), 2022, : 694 - 704