A Time Based Analysis of Data Processing on Hadoop Cluster

Cited by: 0
Authors
Pal, Amrit [1]
Agrawal, Sanjay [1]
Affiliation
[1] Natl Inst Tech Teachers Training & Res, Dept Comp Engn & Applicat, Bhopal, India
Keywords
MapReduce; Hadoop Distributed File System; Name Node; Data Node; Task Tracker; Job Tracker;
DOI
10.1109/CICN.2014.136
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
When data grows so large that a traditional database management system can no longer manage it, it is called Big Data. Managing data at this scale is difficult. Hadoop is a technological answer to Big Data. The Hadoop Distributed File System and the MapReduce programming model handle data storage and the retrieval of information from that data. MapReduce provides effective benchmarks for retrieving information from Big Data. In this paper we present our experimental work on a Hadoop cluster. We analyzed the time the cluster requires to process data as the number of nodes in the cluster increases. We started with a single node and then added one node at a time. We analyzed three types of time: real time, user time, and system time.
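The abstract does not say how the three times were collected. As a minimal sketch, assuming they follow the usual Unix `time` convention (real = wall-clock time, user = CPU time in user mode, system = CPU time in kernel mode), the Python snippet below measures all three for a child process. The function name `time_command` and the stand-in workload are hypothetical and not taken from the paper; an actual experiment would launch a Hadoop job in place of the placeholder command.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a child process and return (real, user, system) in seconds.

    Assumes a Unix-like OS, where os.times() accumulates the CPU time of
    child processes that have been waited for.
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # blocks until the child exits
    real = time.perf_counter() - start
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Hypothetical stand-in workload; replace with the real job command.
    cmd = [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
    real, user, system = time_command(cmd)
    print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

Note that this only accounts for CPU time spent by the local child process. On a multi-node cluster most of the work runs on remote nodes, so real time can far exceed the sum of the locally measured user and system time.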
Pages: 608-612
Page count: 5