Energy-Efficient Task Scheduling for CPU-Intensive Streaming Jobs on Hadoop

Cited by: 20
Authors
Jin, Peiquan [1, 2]
Hao, Xingjun [1]
Wang, Xiaoliang [1]
Yue, Lihua [1, 2]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
[2] Chinese Acad Sci, Key Lab Electromagnet Space Informat, Hefei 230027, Anhui, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Energy efficiency; scheduling algorithms; Hadoop; YARN; MAPREDUCE; SERVERS; POWER;
DOI
10.1109/TPDS.2018.2881176
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Hadoop, especially Hadoop 2.0, has become a dominant framework for real-time big data processing. However, Hadoop is not optimized for energy efficiency. To address this problem, we propose a new framework that improves the energy efficiency of Hadoop 2.0. We focus on YARN, the resource manager in Hadoop 2.0, and propose energy-efficient task scheduling mechanisms for it. In particular, we focus on CPU-intensive streaming jobs and classify them into two types: batch streaming jobs (i.e., a set of jobs submitted simultaneously) and online streaming jobs (i.e., jobs submitted continuously, one by one). We devise a different energy-efficient task scheduling algorithm for each type. Specifically, we first build an abstract model of performance and energy consumption that considers the characteristics of tasks as well as the computational resources in YARN. Based on this model, we study the energy efficiency of streaming tasks, which comprises the performance model and the energy consumption model of a task. We propose two key principles for improving energy efficiency: 1) CPU-usage-aware task allocation, which partitions tasks among NodeManagers (NMs) based on each task's CPU usage; and 2) resource-efficient task allocation, which reduces idle resources. We then propose a D-based binning algorithm for batch task scheduling and a K-based binning algorithm for online task scheduling that can adapt to continuously arriving tasks. We conduct extensive experiments on a real Hadoop 2.0 cluster and use two kinds of workloads to evaluate the performance and energy efficiency of our proposal. Compared with Storm (the streaming data processing tool in Hadoop 2.0) and other approaches, including TAPA and DVFS-MR, our proposal is more energy efficient. The batch task scheduling algorithm reduces energy consumption by up to 10 percent while keeping comparable performance. In addition, the online task scheduling algorithm reduces energy consumption by up to 7 percent compared with the existing algorithms.
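As a rough illustration of the resource-efficient allocation principle named in the abstract, the sketch below packs tasks onto as few nodes as possible with a generic first-fit-decreasing heuristic. This is not the paper's D-based or K-based binning algorithm, whose details are not given in this record. The `Node` class, the `pack_tasks` function, the 100 percent CPU capacity per node, and the sample task loads are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a NodeManager with a CPU budget in percent."""
    name: str
    capacity: float = 100.0
    tasks: list = field(default_factory=list)

    @property
    def used(self) -> float:
        # Total CPU percent claimed by the tasks placed on this node.
        return sum(cpu for _, cpu in self.tasks)


def pack_tasks(tasks, capacity=100.0):
    """First-fit-decreasing packing of (task_id, cpu_percent) pairs.

    A new node is opened only when no existing node has room,
    which keeps the number of partly idle nodes low.
    """
    nodes = []
    # Place the most CPU-hungry tasks first.
    for task_id, cpu in sorted(tasks, key=lambda t: t[1], reverse=True):
        if cpu > capacity:
            raise ValueError(f"task {task_id} exceeds node capacity")
        for node in nodes:
            if node.used + cpu <= node.capacity:
                node.tasks.append((task_id, cpu))
                break
        else:
            # No existing node could host the task, so open a new one.
            node = Node(name=f"nm{len(nodes)}", capacity=capacity)
            node.tasks.append((task_id, cpu))
            nodes.append(node)
    return nodes


# Assumed sample workload: task id and estimated CPU usage in percent.
tasks = [("t1", 60), ("t2", 50), ("t3", 40), ("t4", 30), ("t5", 20)]
nodes = pack_tasks(tasks)
for node in nodes:
    print(node.name, [t for t, _ in node.tasks], node.used)
```

With the sample workload, the five tasks fit on two fully used nodes. A batch scheduler can sort all tasks up front as done here, whereas an online scheduler would have to place each task as it arrives without the global sort.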
Pages: 1298-1311
Page count: 14