Analyzing performance of Apache Tez and MapReduce with hadoop multinode cluster on Amazon cloud

Cited by: 12
Authors
Singh R. [1]
Kaur P.J. [1]
Affiliation
[1] Department of I.T, U.I.E.T, Panjab University, Chandigarh
Keywords
Apache Hive; Apache Pig; Apache Tez; Big Data; Hadoop; HDFS; MapReduce
DOI
10.1186/s40537-016-0051-6
Abstract
Big Data is the term used for data sets so large and complex that traditional systems cannot process them easily, so new technology is needed to handle them. Apache Hadoop is a good option, and its many components work together to make the Hadoop ecosystem robust and efficient. Apache Pig is a core component of the Hadoop ecosystem and accepts tasks in the form of scripts. To run these scripts, Apache Pig may use either the MapReduce or the Apache Tez framework. In our previous paper we analyzed how these two frameworks differ from each other on the basis of some chosen parameters, comparing them both theoretically and empirically on a single-node cluster. In this paper we perform the analysis on a multinode cluster installed on the Amazon cloud. © 2016, The Author(s).
Related papers
28 in total
  • [1] A Performance Comparison of Apache Tez and MapReduce with Data Compression on Hadoop Cluster
    Rattanaopas, Kritwara
    [J]. PROCEEDINGS OF 2017 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2017,
  • [2] An Enhanced Parallelisation Model for Performance Prediction of Apache Spark on a Multinode Hadoop Cluster
    Ahmed, Nasim
    Barczak, Andre L. C.
    Rashid, Mohammad A.
    Susnjak, Teo
    [J]. BIG DATA AND COGNITIVE COMPUTING, 2021, 5 (04)
  • [3] Performance analysis of MapReduce Programs on Hadoop cluster
    Maurya, Mahesh
    Mahajan, Sunita
    [J]. PROCEEDINGS OF THE 2012 WORLD CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGIES, 2012, : 505 - 510
  • [4] Hadoop MapReduce Performance on SSDs for Analyzing Social Networks
    Bakratsas, M.
    Basaras, P.
    Katsaros, D.
    Tassiulas, L.
    [J]. BIG DATA RESEARCH, 2018, 11 : 1 - 10
  • [5] Performance Enhancement of Hadoop MapReduce Framework for Analyzing BigData
    Prabhu, Swathi
    Rodrigues, Anisha P.
    Prasad, Guru M. S.
    Nagesh, H. R.
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER AND COMMUNICATION TECHNOLOGIES, 2015,
  • [6] On The Performance of Apache Hadoop in a Tiny Private IaaS Cloud
    Loewen, Gabriel
    Galloway, Michael
    Vrbsky, Susan
    [J]. PROCEEDINGS OF THE 2013 10TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: NEW GENERATIONS, 2013, : 189 - 195
  • [7] Analyzing BigData with Hadoop Cluster in HDInsight Azure Cloud
    Bhardwaj, Aditya
    Singh, Vineet Kumar
    Choudhary, Vanraj
    Narayan, Yogendra
    [J]. 2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015,
  • [8] An Open Source Project for Tuning and Analyzing MapReduce Performance in Hadoop and Spark
    Chen, Donghua
    Zhang, Runtong
    [J]. IEEE SOFTWARE, 2022, 39 (01) : 61 - 69
  • [9] Apache Hadoop Yarn MapReduce Job Classification Based on CPU Utilization and Performance Evaluation on Multi-cluster Heterogeneous Environment
    Mathiya, Bhavin J.
    Desai, Vinodkumar L.
    [J]. PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ICT FOR SUSTAINABLE DEVELOPMENT, ICT4SD 2015, VOL 1, 2016, 408 : 35 - 44
  • [10] Performance Analysis of Hadoop MapReduce on an OpenNebula Cloud with KVM and OpenVZ Virtualizations
    Magalhaes Vasconcelos, Pedro Roger
    de Araujo Freitas, Gisele Azevedo
    [J]. 2014 9TH INTERNATIONAL CONFERENCE FOR INTERNET TECHNOLOGY AND SECURED TRANSACTIONS (ICITST), 2014, : 471 - 476