Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications

Cited by: 2
Authors
Al-Absi, Ahmed Abdulhakim [1]
Kang, Dae-Ki [1]
Kim, Myong-Jong [2]
Affiliations
[1] Dongseo Univ, Div Comp & Informat Engn, Busan, South Korea
[2] Pusan Natl Univ, Sch Business, 63 Beon Gil 2,Busandaehag Ro, Busan 609735, South Korea
Keywords
Dataset; Hadoop YARN; MapReduce; Big data; Cloud computing
DOI
10.1007/978-3-662-47895-0_2
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
In the Hadoop MapReduce distributed file system, once the input dataset files are loaded and split across the workers, each worker performs the required computation according to the user's logic. This process runs in parallel on all nodes in the cluster and produces the output results. However, contention for resources between the map and reduce stages causes significant delays in execution time, especially because of memory I/O overheads. This is undesirable because task execution in Hadoop MapReduce incurs an overhead from considering redundant data in imprecise applications, which increases the execution time. In this paper we therefore present an approach that optimizes the local worker memory management mechanism to reduce the number of null schedule slots. Efficient utilization of slots leads to reduced execution times. The adopted local memory management mechanism enables efficient parallel execution and reduces memory overheads. The approach effectively reduces the MapReduce computation time, which minimizes the budget needed for application execution in the cloud.
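The split, map, and reduce flow that the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' method and does not use any Hadoop or YARN API: the function names (`split_input`, `map_task`, `reduce_task`, `run_job`) are hypothetical, a local thread pool stands in for the cluster's worker nodes, and word counting is an assumed example of "user logic".

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_input(records, num_workers):
    """Split the input records into one chunk per worker (round-robin)."""
    return [records[i::num_workers] for i in range(num_workers)]


def map_task(chunk):
    """Map stage (assumed user logic): count the words in one chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def reduce_task(partial_counts):
    """Reduce stage: merge the per-worker counts into one result."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


def run_job(records, num_workers=4):
    """Run the map tasks in parallel, then reduce their outputs.

    Threads stand in for cluster nodes here; this is an assumption
    made only to keep the sketch runnable on a single machine.
    """
    chunks = split_input(records, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(map_task, chunks))
    return reduce_task(partial_counts)


if __name__ == "__main__":
    lines = ["big data", "hadoop yarn", "big cluster"]
    print(run_job(lines, num_workers=2))
```

In this sketch the reduce stage starts only after every map task has finished, so no resource contention between the two stages arises; the memory and slot-scheduling effects discussed in the abstract are outside its scope.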
Pages: 9-15
Page count: 7