Analyzing performance of Apache Tez and MapReduce with hadoop multinode cluster on Amazon cloud

Cited by: 12
Authors
Singh R. [1]
Kaur P.J. [1]
Affiliation
[1] Department of I.T, U.I.E.T, Panjab University, Chandigarh
Keywords
Apache Hive; Apache Pig; Apache Tez; Big Data; Hadoop; HDFS; MapReduce
DOI
10.1186/s40537-016-0051-6
Abstract
Big Data is the term used for data sets that are too large and complex to be processed easily by traditional devices. New technology is needed to process these large data sets. Apache Hadoop is a good option, and its many components work together to make the Hadoop ecosystem robust and efficient. Apache Pig is a core component of the Hadoop ecosystem and accepts tasks in the form of scripts. To run these scripts, Apache Pig may use either the MapReduce or the Apache Tez framework. In our previous paper, we analyzed how these two frameworks differ from each other on the basis of selected parameters, comparing them both theoretically and empirically on a single-node cluster. In this paper, we perform the analysis on a multinode cluster installed on the Amazon cloud. © 2016, The Author(s).
Related papers
28 in total
  • [21] Analyzing Web Application Log Files to Find Hit Count Through the Utilization of Hadoop MapReduce in Cloud Computing Environment
    Narkhede, Sayalee
    Baraskar, Trupti
    Mukhopadhyay, Debajyoti
    [J]. 2014 CONFERENCE ON IT IN BUSINESS, INDUSTRY AND GOVERNMENT (CSIBIG), 2014,
  • [22] Impact of Processing and Analyzing Healthcare Big Data on Cloud Computing Environment by Implementing Hadoop Cluster
    Rallapalli, Sreekanth
    Gondkar, R. R.
    Ketavarapu, Uma Pavan Kumar
    [J]. INTERNATIONAL CONFERENCE ON COMPUTATIONAL MODELLING AND SECURITY (CMS 2016), 2016, 85 : 16 - 22
  • [23] Performance of a Low Cost Hadoop Cluster for Image Analysis in Cloud Robotics Environment
    Qureshi, Basit
    Javed, Yasir
    Koubaa, Anis
    Sriti, Mohamed-Foued
    Alajlan, Maram
    [J]. 4TH SYMPOSIUM ON DATA MINING APPLICATIONS (SDMA2016), 2016, 82 : 90 - 98
  • [24] A Self-Tuning System based on Application Profiling and Performance Analysis for Optimizing Hadoop MapReduce Cluster Configuration
    Wu, Dili
    Gokhale, Aniruddha
    [J]. 2013 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2013 : 89 - 98
  • [25] Cloud-POA: A Cloud-Based Map Only Implementation of PO-MSA on Amazon Multi-node EC2 Hadoop Cluster
    Neehal, Nafis
    Karim, Dewan Ziaul
    Islam, Ashraful
    [J]. 2017 20TH INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2017,
  • [26] Towards optimal resource provisioning for Hadoop-MapReduce jobs using scale-out strategy and its performance analysis in private cloud environment
    Ramakrishnan Ramanathan
    B. Latha
    [J]. Cluster Computing, 2019, 22 : 14061 - 14071
  • [27] Towards optimal resource provisioning for Hadoop-MapReduce jobs using scale-out strategy and its performance analysis in private cloud environment
    Ramanathan, Ramakrishnan
    Latha, B.
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (Suppl 6) : 14061 - 14071
  • [28] Multi-Factor Performance Comparison of Amazon Web Services Elastic Compute Cluster and Google Cloud Platform Compute Engine
    Ahuja, Sanjay P.
    Czarnecki, Emily
    Willison, Sean
    [J]. INTERNATIONAL JOURNAL OF CLOUD APPLICATIONS AND COMPUTING, 2020, 10 (03) : 1 - 16