Comparative Analysis of Apache Spark and Hadoop MapReduce Using Various Parameters and Execution Time

Cited by: 1
Authors
Meena, Bhagavathula [1 ]
Sarwani, I. S. L. [2 ]
Archana, M. [3 ]
Supriya, P. [4 ]
Affiliations
[1] Raghu Engn Coll, CSE Dept, Visakhapatnam, Andhra Pradesh, India
[2] ANITS, Visakhapatnam, Andhra Pradesh, India
[3] CVR Coll Engn, Hyderabad, Telangana, India
[4] Raghu Engn Coll, Visakhapatnam, Andhra Pradesh, India
Keywords
Hadoop; Apache Spark; Big Data; HDFS; MapReduce;
DOI
10.1007/978-981-15-1084-7_70
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Rapid growth in information technology has driven considerable advances in electronics and communication. Every hour, large volumes of data are generated across various media; this is referred to as big data. Big data and Hadoop are currently trending terms. In recent years, storing and analyzing data at this scale has become a challenge for computer science and information technology practitioners worldwide. Because Apache Spark and Hadoop are both frameworks used for analyzing big data, this paper compares the two using datasets of different sizes and their execution times. The comparison is made using the word count algorithm. Both frameworks target big data processing, yet their performance varies significantly. This paper presents an analysis of both frameworks for the word count algorithm in the Hadoop MapReduce and Apache Spark environments.
Pages: 719-725
Number of pages: 7