Improving MapReduce heterogeneous performance using KNN fair share scheduling

Cited by: 1
Authors
Kalia, Khushboo [1]
Dixit, Saurav [2,3]
Kumar, Kaushal [1]
Gera, Rajat [1]
Epifantsev, Kirill [4]
John, Vinod [5]
Taskaeva, Natalia [6]
Affiliations
[1] KR Mangalam Univ, Sohna Rural, Haryana, India
[2] Peter Great St Petersburg Polytech Univ, Peter Great St, St Petersburg 195251, Russia
[3] Uttaranchal Univ, Div Res & Innovat, Dehra Dun 248007, Uttaranchal, India
[4] St Petersburg Univ Aerosp Instrumentat, St Petersburg 190000, Russia
[5] Amity Univ, Noida 201301, India
[6] Moscow State Univ Civil Engn, Natl Res Univ, Dept management & innovat, Yaroslavsko Shosse, Moscow 129337, Russia
Keywords
Hadoop; MapReduce; Scheduling; Speculative prefetching and clustering
DOI
10.1016/j.robot.2022.104228
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
MapReduce is one of the essential programming models for parallel processing and distributed storage of very large data sets. The default Hadoop implementation assumes that the executing nodes are homogeneous. Data locality is an important feature that Hadoop introduced to improve the performance of the traditional MapReduce model. The key idea is to move the map task to the node where the data resides rather than transferring the large data set to the computation. Data locality helps lower network congestion and improve performance. However, this practice breaks down when data is processed in a heterogeneous Hadoop cluster. In a heterogeneous setup, nodes with different computational capabilities pose a crucial challenge: nodes with faster processing capacity finish their tasks sooner than nodes with slower processing ability. This paper proposes a KNN-based scheduler that focuses on speculative prefetching and clustering of the data. The process starts with speculative prefetching and then performs KNN clustering on the intermediate map output before directing it to the reducer for final processing. The scheduler's performance is evaluated by executing different workloads such as WordCount, RandomText, RandomNum, and Sort. The results show that the proposed idea improves the performance of job execution. (C) 2022 Elsevier B.V. All rights reserved.
Pages: 9
Related papers
50 in total
  • [1] Improving the Performance of kNN in the MapReduce Framework Using Locality Sensitive Hashing
    Bagui, Sikha
    Mondal, Arup Kumar
    Bagui, Subhash
    [J]. INTERNATIONAL JOURNAL OF DISTRIBUTED SYSTEMS AND TECHNOLOGIES, 2019, 10 (04) : 1 - 16
  • [2] MrHeter: improving MapReduce performance in heterogeneous environments
    Zhang, Xiao
    Wu, Yanjun
    Zhao, Chen
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2016, 19 (04) : 1691 - 1701
  • [3] MrHeter: improving MapReduce performance in heterogeneous environments
    Xiao Zhang
    Yanjun Wu
    Chen Zhao
    [J]. Cluster Computing, 2016, 19 : 1691 - 1701
  • [4] Improving Fair Scheduling Performance on Hadoop
    Cheng, Ya-Wen
    Lo, Shou-Chih
    [J]. 2017 INTERNATIONAL CONFERENCE ON PLATFORM TECHNOLOGY AND SERVICE (PLATCON), 2017, : 1 - 6
  • [5] Performance Modeling of Systems using Fair Share Scheduling with Layered Queueing Networks
    Li, Lianhua
    Franks, Greg
    [J]. 2009 IEEE INTERNATIONAL SYMPOSIUM ON MODELING, ANALYSIS & SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS (MASCOTS), 2009, : 289 - 298
  • [6] Improving MapReduce Performance in a Heterogeneous Cloud: A Measurement Study
    Zhao, Xu
    Liu, Ling
    Zhang, Qi
    Dong, Xiaoshe
    [J]. 2014 IEEE 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2014, : 401 - 408
  • [7] Improving Encryption Performance using MapReduce
    Desai, Sanket
    Park, Younghee
    Gao, Jerry
    Chang, Sang-Yoon
    Song, Chungsik
    [J]. 2015 IEEE 17TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2015 IEEE 7TH INTERNATIONAL SYMPOSIUM ON CYBERSPACE SAFETY AND SECURITY, AND 2015 IEEE 12TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (ICESS), 2015, : 1350 - 1355
  • [8] Improving MapReduce Performance in Heterogeneous Environments with Adaptive Task Tuning
    Cheng, Dazhao
    Rao, Jia
    Guo, Yanfei
    Zhou, Xiaobo
    [J]. ACM/IFIP/USENIX MIDDLEWARE 2014, 2014, : 97 - 108
  • [9] Improving Performance of Heterogeneous MapReduce Clusters with Adaptive Task Tuning
    Cheng, Dazhao
    Rao, Jia
    Guo, Yanfei
    Jiang, Changjun
    Zhou, Xiaobo
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, 28 (03) : 774 - 786
  • [10] Improving MapReduce Performance by Data Prefetching in Heterogeneous or Shared Environments
    Gu, Tao
    Zuo, Chuang
    Liao, Qun
    Yang, Yulu
    Li, Tao
    [J]. INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2013, 6 (05) : 71 - 81