Improving Scheduling Efficiency of Hadoop YARN Using AFSA Algorithm

Cited by: 0
Authors
Gao Junlei [1]
Tang Tie [1]
Wu Gangshan [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
Keywords
YARN; Scheduler; AFSA; Hadoop
DOI
10.1109/ISPA/IUCC.2017.00141
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Apache Hadoop is one of the most popular MapReduce frameworks for parallel processing of large data sets. As its job scheduler and resource manager, YARN plays a very important role. Schedulers on YARN are designed to minimize the makespan of MapReduce jobs. The performance of a YARN scheduler depends not only on whether the resource capacity of the worker nodes is fully utilized, but also on the dependencies among tasks. It is therefore very difficult to achieve an optimal solution. This paper proposes a new Hadoop YARN scheduling algorithm. The algorithm formalizes the problem as a multiple knapsack problem that takes into consideration the resource cost and time cost of each task as well as the dependencies between tasks. The Artificial Fish Swarm Algorithm (AFSA) is adopted to solve the knapsack optimization problem. The algorithm was implemented as a pluggable scheduler on the most recent version of Hadoop YARN and evaluated with several MapReduce benchmarks. The experimental results show that our scheduler could effectively reduce the makespan of Hadoop jobs by 30% compared with some existing scheduling policies.
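The abstract's idea of treating scheduling as a multiple knapsack problem solved by a fish-swarm search can be illustrated with a small sketch. This is not the authors' scheduler: it is a simplified, discrete AFSA with only the prey and follow behaviours, it ignores task dependencies and time cost, and every name in it (`fitness`, `prey`, `follow`, `afsa`, the `demand`, `value`, and `capacity` lists, and all parameter defaults) is a hypothetical choice made for illustration.

```python
import random

def fitness(assign, demand, value, capacity):
    """Total value of placed tasks; 0 if any node exceeds its capacity."""
    load = [0] * len(capacity)
    total = 0
    for task, node in enumerate(assign):
        if node >= 0:                      # -1 means "task not placed"
            load[node] += demand[task]
            total += value[task]
    if any(l > c for l, c in zip(load, capacity)):
        return 0
    return total

def hamming(a, b):
    """Number of tasks on which two assignments differ."""
    return sum(x != y for x, y in zip(a, b))

def prey(fish, fit, visual, tries, n_nodes, rng):
    """Prey behaviour: sample nearby states, move to the first better one."""
    for _ in range(tries):
        cand = fish[:]
        for i in rng.sample(range(len(fish)), min(visual, len(fish))):
            cand[i] = rng.randrange(-1, n_nodes)
        if fit(cand) > fit(fish):
            return cand
    return fish

def follow(fish, school, fit, visual, rng):
    """Follow behaviour: take one step toward the best neighbour in range."""
    near = [f for f in school if 0 < hamming(f, fish) <= visual]
    if not near:
        return fish
    best = max(near, key=fit)
    if fit(best) <= fit(fish):
        return fish
    diff = [i for i in range(len(fish)) if fish[i] != best[i]]
    cand = fish[:]
    i = rng.choice(diff)
    cand[i] = best[i]
    return cand if fit(cand) > fit(fish) else fish

def afsa(demand, value, capacity, n_fish=20, iters=200,
         visual=2, tries=5, seed=0):
    """Simplified AFSA: each fish is a task-to-node assignment vector."""
    rng = random.Random(seed)
    n_tasks, n_nodes = len(demand), len(capacity)
    fit = lambda a: fitness(a, demand, value, capacity)
    # Start every fish from the empty (always feasible) assignment.
    school = [[-1] * n_tasks for _ in range(n_fish)]
    best = school[0][:]
    for _ in range(iters):
        for k in range(n_fish):
            moved = follow(school[k], school, fit, visual, rng)
            if moved is school[k]:         # follow made no progress
                moved = prey(school[k], fit, visual, tries, n_nodes, rng)
            school[k] = moved
            if fit(moved) > fit(best):
                best = moved[:]
    return best, fit(best)

if __name__ == "__main__":
    # Hypothetical toy instance: 4 tasks, 2 nodes.
    best, score = afsa(demand=[4, 3, 2, 5], value=[4, 3, 2, 5],
                       capacity=[6, 5])
    print(best, score)
```

In this sketch each knapsack is a node, each item is a task, and infeasible assignments simply score zero. A scheduler along the lines the abstract describes would also need to encode dependencies and task durations in the fitness function.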
Pages: 919-924
Page count: 6