Big Data Analytics: Performance Evaluation for High Availability and Fault Tolerance using MapReduce Framework with HDFS

Authors
Verma, Jai Prakash [1]
Mankad, Sapan H. [1]
Garg, Sanjay [1]
Affiliations
[1] Nirma Univ, Inst Technol, Comp Sci & Engn Dept, Ahmadabad, Gujarat, India
Keywords
Big Data Analytics; Big Data; MapReduce; Hadoop Cluster; High Availability
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Big data analytics helps in analyzing structured transaction data as well as the semi-structured and unstructured data handled by analytics programs. Internet clickstream data, mobile-phone call details, and server logs are examples of big data. Such data does not fit well in a traditional data warehouse oriented toward relational databases, since big data sets are updated frequently and large amounts of data are generated in real time. Many open-source solutions are available for handling such large-scale data. The Hadoop Distributed File System (HDFS) is one of these solutions; it helps in storing, managing, and analyzing big data. Hadoop has become a standard for distributed storage and computing in big data analytics applications. It can manage distributed nodes for storing and processing data in a distributed manner. The Hadoop architecture is also described as "store everything now and decide how to process later." This paper discusses the challenges and issues of setting up and configuring a multi-node Hadoop cluster. Troubleshooting for high availability of nodes under different Hadoop cluster failure scenarios is tested experimentally with datasets of different sizes. The experimental analysis carried out in this paper helps in using a Hadoop cluster effectively for research and analysis. It also provides suggestions for selecting the size of a Hadoop cluster according to data size and generation speed.
Pages: 770-775
Page count: 6