A BigData MapReduce Hadoop Distribution Architecture for Processing Input Splits to solve the Small Data Problem

Cited by: 0
Authors
Manjunath, R. [1]
Tejus [1]
Channabasava, R. K. [1]
Balaji, S. [2]
Affiliations
[1] City Engn Coll, Dept CSE, Hyderabad, Andhra Pradesh, India
[2] Jain Univ, Bengaluru, India
Keywords
Hadoop; MapReduce; input splits
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Hadoop is an open-source Java framework for processing big data. It has two core components. HDFS (Hadoop Distributed File System) stores very large volumes of data on inexpensive hardware and is fault tolerant, that is, it continues normal operation despite hardware or software faults. MapReduce is a processing technique and programming model that runs in a parallel and distributed manner. Hadoop does not perform well on small files: a large number of small files places a heavy load on the HDFS NameNode, which in turn prolongs the execution time of MapReduce jobs. Because Hadoop is designed specifically to handle large volumes of data, it incurs a performance cost when dealing with a large number of small files. This paper gives a detailed description of HDFS, the small-file problem, existing ways of dealing with it, and a proposed approach for handling small files. In the proposed approach, small files are merged using MapReduce, the programming model on Hadoop. With this approach, Hadoop's performance in handling small files, once merged into files larger than the block size, is improved. We also propose a traffic analyzer that combines Hadoop with the MapReduce paradigm. Together, Hadoop and the MapReduce programming tools make it possible to provide batch analysis with minimal response time and in-memory computing capacity, so that logs are processed in a highly available, efficient, and stable way.
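The abstract does not spell out how the small files are merged. As an illustration only, the following is a minimal sketch in plain Python (not Hadoop code and not the authors' implementation) of the general idea: pack many small files into a few merged files of at most one block each, so that far fewer files have to be tracked and processed. The 128 MiB block size, the first-fit-decreasing packing strategy, and the names `MergedFile` and `merge_small_files` are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed block size of 128 MiB; the real value is a cluster setting.
BLOCK_SIZE = 128 * 1024 * 1024


@dataclass
class MergedFile:
    """Hypothetical container: one merged file built from several small files."""
    members: List[str] = field(default_factory=list)
    size: int = 0


def merge_small_files(sizes: Dict[str, int], target: int = BLOCK_SIZE) -> List[MergedFile]:
    """Greedily pack files into merged files of at most `target` bytes.

    Files are visited largest-first (first-fit decreasing). A file that
    does not fit into any existing merged file starts a new one, so a
    file larger than `target` ends up alone in its own merged file.
    """
    merged: List[MergedFile] = []
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        for m in merged:
            if m.size + size <= target:
                m.members.append(name)
                m.size += size
                break
        else:
            # No existing merged file has room: open a new one.
            merged.append(MergedFile([name], size))
    return merged


if __name__ == "__main__":
    # 1000 files of 1 MiB each, as a toy workload.
    small = {f"part-{i:04d}": 1024 * 1024 for i in range(1000)}
    result = merge_small_files(small)
    print(f"{len(small)} small files -> {len(result)} merged files")
```

In this toy workload the file count drops from 1000 to 8. In an actual Hadoop job the merging would be carried out by MapReduce tasks, as the abstract states; the sketch only shows the packing decision.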
Pages: 480-487
Page count: 8