HaSTE: Hadoop YARN Scheduling Based on Task-Dependency and Resource-Demand

Cited by: 36
Authors
Yao, Yi [1]
Wang, Jiayin [2]
Sheng, Bo [2]
Lin, Jason [1]
Mi, Ningfang [1]
Affiliations
[1] Northeastern Univ, Dept Elect & Comp Engn, 360 Huntington Ave, Boston, MA 02115 USA
[2] Univ Massachusetts Boston, Dept Comp Sci, 100 Morrissey Blvd, Boston, MA 02125 USA
Funding
U.S. National Science Foundation
Keywords
MAPREDUCE; CLASSIFICATION;
DOI
10.1109/CLOUD.2014.34
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The MapReduce framework has become the de facto scheme for scalable semi-structured and unstructured data processing in recent years. The Hadoop ecosystem has evolved into its second generation, Hadoop YARN, which adopts fine-grained resource management schemes for job scheduling. One of the primary performance concerns in YARN is how to minimize the total completion length, i.e., makespan, of a set of MapReduce jobs. However, the precedence constraint or fairness constraint in current widely used scheduling policies in YARN, such as FIFO and Fair, can both lead to inefficient resource allocation in the Hadoop YARN cluster. They also omit the dependency between tasks, which is crucial for the efficiency of resource utilization. We thus propose a new YARN scheduler, named HaSTE, which can effectively reduce the makespan of MapReduce jobs in YARN by leveraging the information of requested resources, resource capacities, and dependency between tasks. We implemented HaSTE as a pluggable scheduler in the most recent version of Hadoop YARN, and evaluated it with classic MapReduce benchmarks. The experimental results demonstrate that our YARN scheduler effectively reduces the makespans and improves resource utilization compared to the current scheduling policies.
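As an illustration of why a strict precedence constraint can lengthen the makespan, the following is a minimal toy sketch. It is not HaSTE's algorithm and not YARN's actual FIFO scheduler: the single-resource model, the `makespan` function, and the task values are all assumptions made for illustration. It compares strict queue-order launching (a task that does not fit blocks everything behind it) with a greedy policy that launches any queued task that fits the free capacity.

```python
from typing import List, Tuple

Task = Tuple[int, int]  # (resource demand, duration in time units); hypothetical model


def makespan(tasks: List[Task], capacity: int, strict_fifo: bool) -> int:
    """Simulate a single-resource cluster and return the total completion time.

    strict_fifo=True : the first queued task that does not fit blocks the queue.
    strict_fifo=False: any queued task that fits the free capacity is launched.
    """
    if any(demand > capacity for demand, _ in tasks):
        raise ValueError("a task demands more than the cluster capacity")

    pending = list(tasks)
    running: List[Tuple[int, int]] = []  # (finish time, demand)
    now = 0
    while pending or running:
        # Release the resources of tasks that have finished by the current time.
        running = [(finish, demand) for finish, demand in running if finish > now]
        free = capacity - sum(demand for _, demand in running)

        # Launch queued tasks according to the chosen policy.
        i = 0
        while i < len(pending):
            demand, duration = pending[i]
            if demand <= free:
                running.append((now + duration, demand))
                free -= demand
                pending.pop(i)
            elif strict_fifo:
                break  # head-of-line blocking
            else:
                i += 1  # skip this task and try the next one

        # Advance the clock to the next task completion.
        if running:
            now = min(finish for finish, _ in running)
    return now


# Hypothetical workload: two large tasks followed by two small ones.
example_tasks = [(3, 2), (3, 2), (1, 2), (1, 2)]
fifo_result = makespan(example_tasks, capacity=4, strict_fifo=True)
greedy_result = makespan(example_tasks, capacity=4, strict_fifo=False)
```

For this workload on a capacity of 4, the strict queue order finishes at time 6, while the resource-aware greedy order finishes at time 4, because the small tasks fill capacity that would otherwise sit idle. A scheduler such as the one described in the abstract additionally accounts for task dependencies, which this sketch does not model.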
Pages: 184-191
Page count: 8
Related papers
50 in total
  • [1] New Scheduling Algorithms for Improving Performance and Resource Utilization in Hadoop YARN Clusters
    Yao, Yi
    Gao, Han
    Wang, Jiayin
    Sheng, Bo
    Mi, Ningfang
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2021, 9 (03) : 1158 - 1171
  • [2] Opportunistic Mobile Crowd Computing: Task-dependency Based Work-Stealing
    Nagesh, Sanjay Segu
    Fernando, Niroshinie
    Loke, Seng W.
    Neiat, Azadeh Ghari
    Pathirana, Pubudu N.
    PROCEEDINGS OF THE 2022 THE 28TH ANNUAL INTERNATIONAL CONFERENCE ON MOBILE COMPUTING AND NETWORKING, ACM MOBICOM 2022, 2022, : 775 - 777
  • [3] Residual Traffic Based Task Scheduling in Hadoop
    Tanaka, Daichi
    Kawarasaki, Masatoshi
    CLOUD COMPUTING 2015: THE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, GRIDS, AND VIRTUALIZATION, 2015, : 94 - 102
  • [4] New Scheduling Algorithm in Hadoop Based on Resource Aware
    Xu, Peng
    Wang, Hong
    Tian, Ming
    PRACTICAL APPLICATIONS OF INTELLIGENT SYSTEMS, ISKE 2013, 2014, 279 : 1011 - 1020
  • [5] TaSRD: Task Scheduling Relying on Resource and Dependency in Mobile Edge Computing
    Cao, Yuting
    Chen, Haopeng
    Jiang, Jianwei
    Hu, Fei
    PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (PIC), 2018, : 287 - 295
  • [6] Evaluating Task Scheduling in Hadoop-based Cloud Systems
    Liu, Shengyuan
    Xu, Jungang
    Liu, Zongzhen
    Liu, Xu
    2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013,
  • [7] On Demand Resource Scheduler Based on Estimating Progress of Jobs in Hadoop
    Chen, Liangzhang
    Xu, Jie
    Li, Kai
    Lu, Zhonghao
    Qi, Qi
    Wang, Jingyu
    COLLABORATE COMPUTING: NETWORKING, APPLICATIONS AND WORKSHARING, COLLABORATECOM 2016, 2017, 201 : 615 - 626
  • [8] An emergency task scheduling method based on YARN capacity scheduler
    Yan, Jian
    Jia, Chen
    Yong, Yuan
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ALGORITHMS, SOFTWARE ENGINEERING, AND NETWORK SECURITY, ASENS 2024, 2024, : 591 - 596
  • [9] RAS: A Task Scheduling Algorithm Based on Resource Attribute Selection in a Task Scheduling Framework
    Zhao, Yong
    Chen, Liang
    Li, Youfu
    Liu, Peng
    Li, Xiaolong
    Zhu, Chenchen
    INTERNET AND DISTRIBUTED COMPUTING SYSTEMS, IDCS 2013, 2013, 8223 : 106 - 119
  • [10] Resource Aware Scheduling in Hadoop for Heterogeneous Workloads based on Load Estimation
    Kapil, Sutariya B.
    Kamath, Sowmya S.
    2013 FOURTH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATIONS AND NETWORKING TECHNOLOGIES (ICCCNT), 2013,