Comparative Analysis of Apache Spark and Hadoop MapReduce Using Various Parameters and Execution Time

Cited by: 1

Authors:

Meena, Bhagavathula [1]

Sarwani, I. S. L. [2]

Archana, M. [3]

Supriya, P. [4]

Affiliations:

[1] Raghu Engn Coll, CSE Dept, Visakhapatnam, Andhra Pradesh, India

[2] ANITS, Visakhapatnam, Andhra Pradesh, India

[3] CVR Coll Engn, Hyderabad, Telangana, India

[4] Raghu Engn Coll, Visakhapatnam, Andhra Pradesh, India

Source:

INTELLIGENT COMPUTING AND COMMUNICATION, ICICC 2019 | 2020 / Vol. 1034

Keywords:

Hadoop; Apache Spark; Big Data; HDFS; MapReduce;

DOI:

10.1007/978-981-15-1084-7_70

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Rapid growth in information technology has brought considerable advances in electronics and communication. Every hour, large volumes of data are generated across various media; this is referred to as big data. Big Data and Hadoop are trending terms nowadays. Over the past few years, storing and analyzing data at this scale has become one of the main challenges for computer science and information technology practitioners throughout the world. Because Apache Spark and Hadoop are both frameworks used for analyzing big data, this paper compares the two on datasets of different sizes and in terms of execution time. The comparison is made using a word count algorithm. Both frameworks rest on design ideas that lead to significantly different Big Data performance. This paper presents an analysis of both frameworks for the word count algorithm in Hadoop MapReduce and Apache Spark environments.


Pages: 719-725

Number of pages: 7
