Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications

Cited by: 2

Authors:

Al-Absi, Ahmed Abdulhakim [1]

Kang, Dae-Ki [1]

Kim, Myong-Jong [2]

Affiliations:

[1] Dongseo Univ, Div Comp & Informat Engn, Busan, South Korea

[2] Pusan Natl Univ, Sch Business, 63 Beon Gil 2,Busandaehag Ro, Busan 609735, South Korea

Source:

ADVANCED MULTIMEDIA AND UBIQUITOUS ENGINEERING: FUTURE INFORMATION TECHNOLOGY, VOL 2 | 2016 / Vol. 354

Keywords:

Dataset; Hadoop YARN; MapReduce; Big data; Cloud computing;

DOI:

10.1007/978-3-662-47895-0_2

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

In the Hadoop MapReduce framework and its distributed file system, once the input dataset files are loaded and split across the workers, the workers start the required computation according to user logic. This process runs in parallel on all nodes in the cluster and computes the output results. However, contention for resources between the map and reduce stages causes significant delays in execution time, especially due to memory I/O overheads. This is undesirable because task execution in Hadoop MapReduce incurs an overhead from processing redundant data in imprecise applications, which increases execution time. Thus, in this paper we present our approach to optimizing the local worker memory management mechanism to reduce the presence of null schedule slots. Efficient utilization of slots leads to reduced execution times. The local memory management mechanism adopted enables efficient parallel execution and reduces memory overheads. The approach effectively reduced MapReduce computation time, which reduces the budget required for application execution in the cloud.


Pages: 9-15

Page count: 7
